WO1992000590A1 - Ante-memoire a acces selectif (Selective access cache memory) - Google Patents
- Publication number
- WO1992000590A1 (PCT/US1991/004484)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- memory
- write
- cache
- address
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0888—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches using selective caching, e.g. bypass
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0877—Cache access modes
- G06F12/0879—Burst mode
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0815—Cache consistency protocols
- G06F12/0831—Cache consistency protocols using a bus scheme, e.g. with bus monitoring or watching means
- G06F12/0835—Cache consistency protocols using a bus scheme, e.g. with bus monitoring or watching means for main memory peripheral accesses (e.g. I/O or DMA)
Definitions
- This invention relates to memory subsystems, and more specifically to random access cache memory systems.
- A variety of bus transfer mechanisms are used by present day microprocessors.
- bus cycle includes at least two "bus states;" a bus state is the shortest time unit of bus activity and requires one processor clock period.
- the model 80386 microprocessor may be provided with either 16-bit or 32-bit wide data buses.
- each group of 16 bits is considered to be a physical word, and begins at an address that is a multiple of 2.
- each group of 32 bits is considered to be a physical "doubleword," and begins at a byte address that is a multiple of 4.
- Memory addressing is flexible, and
- a logical operand that spans more than one physical doubleword or one physical word, or that is a doubleword operand and begins at an address not evenly divisible by 4, or that is a word operand split between two physical doublewords.
- Dynamic data bus sizing is supported.
- the model 80386 microprocessor has separate, parallel buses for data and address.
- the data bus is 32-bits in width and is bidirectional.
- the address bus provides a 32-bit value using thirty signals for the thirty upper-order address bits, and four byte-enable signals to indicate the active bytes.
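The byte-enable scheme described above can be sketched as a small model; the function name and the boolean representation (True = lane active, i.e. the active-low BE# pin driven low) are illustrative assumptions, not from the patent:

```python
def byte_enables(addr: int, size: int) -> list:
    """Model of the four byte-enable outputs (BE0#-BE3#): the low two
    address bits are not driven as address lines but encoded as four
    per-byte-lane enables for a 32-bit data bus."""
    lane = addr & 0x3              # position of the first byte within the doubleword
    enables = [False] * 4
    for i in range(size):          # size = bytes accessed in this doubleword (1-4)
        if lane + i < 4:
            enables[lane + i] = True
    return enables

# a 16-bit (word) write at byte address 1 activates lanes 1 and 2:
print(byte_enables(1, 2))  # [False, True, True, False]
```

An operand that would run past lane 3 spills into the next doubleword, which is why misaligned operands (as noted above) require more than one bus cycle.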
- Figure 1 shows basic two clock, no wait state, single read and write cycles.
- the first cycle, a write cycle comprising bus states 1 and 2, is initiated when the address status signal ADS# is asserted at an edge of clock signal CLK in bus state 1.
- signal A2-A31 provides a valid address to the system memory; at a later time in bus state 1, signal D0-D31 makes available valid data to the system memory.
- the external system asserts the ready signal RDY#. In Figure 1, this occurs at the end of bus state 2.
- the second cycle, a read cycle comprising bus states 3 and 4, is initiated when the address status signal ADS# is asserted at an edge of clock signal CLK in bus state 3.
- signal A2-A31 provides a valid address to the system memory.
- the external system asserts the ready signal RDY#. In Figure 1, this occurs at the end of bus state 4.
- Figure 2 shows the use of wait states. Wait states are used because the bus cycle time of many commercially available microprocessors is much shorter than the access time of the system memory.
- the microprocessor must "wait" for the system memory to complete its read or write, which is accomplished by the insertion of one or more wait states into the bus cycle.
- the second cycle of Figure 2, which is a write cycle, includes three bus states 5, 6 and 7.
- Bus state 5 is analogous to bus state 1 of Figure 1
- bus state 7 is analogous to bus state 2 of Figure 1.
- Bus state 6 is a wait state, inserted because the ready signal READY# was not asserted until bus state 7. Additional wait states are asserted if necessary. The address and bus cycle definition remain valid during all wait states.
- the third bus cycle, a read cycle, includes three bus states 8, 9 and 10, bus state 9 being a wait state.
- the model 80486 microprocessor provides a number of additional features, including an internal cache, a burst bus mechanism for high-speed internal cache fills, and four write buffers to enhance the performance of consecutive writes.
- model 80486 microprocessor supports not only single and multiple nonburst, non-cacheable cycles, but also single and multiple burst or cacheable cycles.
- Burst memory access is used to transfer data rapidly in response to bus requests that require more than a single data cycle.
- a new data item is strobed into the microprocessor every clock.
- the fastest burst cycle (no wait state) requires two clocks for the first data item (one clock for the address, one clock for the corresponding data item), with subsequent data items returned from sequential addresses on every subsequent clock. Note that in non-burst cycles, data is strobed at best in every other clock.
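The clock counts above can be checked with a simple sketch (best-case, no wait states assumed):

```python
def nonburst_clocks(items: int) -> int:
    """Non-burst: data is strobed at best every other clock,
    so two clocks per data item."""
    return 2 * items

def burst_clocks(items: int) -> int:
    """Burst: two clocks for the first item (one for the address,
    one for the data), then one clock per subsequent item."""
    return items + 1 if items else 0

# filling a four-doubleword cache line:
print(nonburst_clocks(4), burst_clocks(4))  # 8 5
```

For a four-item line fill the burst saves three clocks, and the advantage grows with the transfer length.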
- Burst mode operation is illustrated in Figure 3.
- a burst cycle, a burst read in Figure 3, begins with an address being driven and signal ADS# being asserted during the first bus state 12, just as in a non-burst cycle.
- the cache line is four words.
- the burst mode is indicated when burst ready signal BRDY# is driven active and signal RDY# is driven inactive at the end of each bus state 14, 16, 18 and 20 in the burst cycle.
- the external memory is signaled to end the burst when the last burst signal BLAST# is driven active at the end of the last bus state 20 in the burst cycle.
- Cache memory systems have been developed to permit the efficient use of low cost, high capacity DRAM memory.
- Cache memory subsystems store recently used information locally in a small, fast memory.
- bus transfers are limited to the microprocessor-cache data path, system speed increases.
- Cache memory may be internal to the microprocessor, as in the model 80486 microprocessor of Intel Corporation, or external to the microprocessor, as illustrated in Figure 4.
- Microprocessor 22 is connected to a local address bus 24 and a local data bus 26.
- cache memory 28 is connected to the local address bus 24 and the local data bus 26. If cache controller 30 determines that data corresponding to the address requested is resident in the cache memory 28, the data is transferred over the local data bus 26. If cache controller 30 determines that the data is not resident in the cache memory 28, the address and data are transferred through cache bus buffer/latch 32A and 32B respectively to the system bus 34. Cache bus buffer/latch 32A and 32B are controlled by cache bus controller 36, which receives control signals from cache controller 30.
- System memory 38 and system peripherals 40 are connected to the system bus 34.
- Bus state 42 involves a cache fill, as established by activation of the cache enable signal KEN#.
- the signal BLAST# remains inactive during bus state 42.
- the first cycle terminates with the data transfer to the processor in bus state 44.
- Three additional data cycles consisting of, respectively, bus states 46 and 48, bus states 50 and 52, and bus states 54 and 56, are needed to complete the cache fill.
- Signal BLAST# remains inactive until the last transfer in the cache line fill, which occurs in bus state 56.
- Cache memory has been used for burst transfers, as shown in Figure 3.
- a cache fill is indicated when the signal KEN# is active and the signal BLAST# is inactive during bus state 12.
- the signal BLAST# remains unknown in successive bus states 14, 16 and 18 and is activated only in the fourth successive bus state 20, so that four data items may be burst in succession.
- the external system informs the microprocessor that it will burst the line in by driving signal BRDY# active during the four successive bus states 14, 16, 18 and 20 in which data is transferred.
- main memory must be updated.
- the most widely used methods of updating main memory are write-through and write-back.
- in write-through, main memory is automatically updated at the same time the cache is written. The processor must wait until the write is completed before it may resume execution.
- a variation, usually called "posted" write-through, uses a buffer into which the write data is latched while the processor continues execution. The latched data is then written to main memory whenever the system bus is available.
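The posted write-through described above can be sketched as a toy model; the class and method names here are illustrative, not taken from the patent:

```python
import collections

class PostedWriteBuffer:
    """Sketch of a "posted" write-through: the write is latched so the
    processor continues immediately, and the buffered writes drain to
    main memory when the system bus is free."""

    def __init__(self):
        self.pending = collections.deque()  # latched but not yet written
        self.memory = {}                    # stands in for main memory

    def cpu_write(self, addr, value):
        self.pending.append((addr, value))  # processor does not wait

    def bus_idle(self):
        # called whenever the system bus becomes available
        while self.pending:
            addr, value = self.pending.popleft()
            self.memory[addr] = value
```

The processor-visible latency is only the latch; the actual memory update happens later, which is why a subsequent read must check the buffer as well in a full design.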
- the cache memory system according to the present invention is compatible with a wide variety of bus transfer mechanisms.
- burst mode a "demand word first" wrapped around quad fetch order is supported.
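A "demand word first" wrapped-around order can be sketched as follows; this shows the simple modulo variant implied by "wrapped around" (the i486 itself bursts in an interleaved order, so treat the exact sequence as an assumption for illustration):

```python
def wrapped_fetch_order(demand_dword: int, line_dwords: int = 4) -> list:
    """Return the fetch order for a quad fetch: start at the doubleword
    the processor demanded, continue through the line, and wrap back
    to the line's start."""
    return [(demand_dword + i) % line_dwords for i in range(line_dwords)]

# a miss on the third doubleword of a four-doubleword line:
print(wrapped_fetch_order(2))  # [2, 3, 0, 1]
```

Returning the demanded word first lets the processor resume as soon as the first transfer completes, while the rest of the line fills in behind it.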
- the cache memory system of the present invention decouples the main memory subsystem from the host data bus, so as to accommodate parallel cache-hit and system memory transfer operations for increased system speed, and to hide system memory write-back cycles from the microprocessor.
- Differences in the speed of the local and system buses are accommodated, and an easy migration path from non-burst mode microprocessor based systems to burst mode microprocessor based systems is provided.
- various memory organizations are accommodated, including direct-mapped or one-way set associative, two-way set associative, and four-way set associative.
- a memory cache apparatus comprises a random access memory, a host port, and a system port.
- the memory cache apparatus further comprises an input latch connected to the host port for selectively writing data to the memory and an output register connected to the system port for receiving data from the memory and selectively furnishing the data to the host port or to the system port.
- the input latch is a memory write register
- the output register comprises a read hold register for furnishing data to the host port and a write back register.
- a method for operating a memory cache apparatus wherein the memory cache apparatus includes a random access memory, a host port, a system port, an input register coupled to the host port, and an output register coupled to the system port.
- the method comprises the steps of latching input data into the input register from the host port, comparing a received address to a plurality of cache addresses, loading replaced data from the random access memory into the output register if the received address does not match one of the plurality of cache addresses, loading the input data into the random access memory, and providing the replaced data to the system port.
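The claimed steps can be sketched as a toy direct-mapped model; the class, attribute, and method names are illustrative, not from the patent:

```python
class CacheModel:
    """Toy model of the claimed write path: latch input from the host
    port, compare tags, stage the replaced line in the output register,
    update the RAM, and provide the replaced data to the system port."""

    def __init__(self, sets: int = 2048):
        self.tags = [None] * sets
        self.data = [None] * sets
        self.system_port = []                # records write-backs

    def write(self, set_index: int, tag: int, value):
        input_register = value               # 1. latch input data from the host port
        if self.tags[set_index] != tag:      # 2. compare against the cache addresses
            if self.tags[set_index] is not None:
                # 3. load the replaced data into the output register...
                output_register = (self.tags[set_index], self.data[set_index])
                # 5. ...and provide it to the system port
                self.system_port.append(output_register)
            self.tags[set_index] = tag
        self.data[set_index] = input_register  # 4. load input data into the RAM
```

The point of the two registers is ordering: the replaced data is captured before the new data overwrites the RAM location, so the write-back can proceed on the system port while the host port is already finished.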
- a computer system comprises a host microprocessor having a host address bus and a host data bus, a system memory having a system address bus and a system data bus and a dual port cache memory having a system port connected to the system data bus and a host port connected to the host data bus.
- a cache controller is further connected to the cache memory.
- Figure 1 is a set of waveforms
- Figure 2 is a set of waveforms
- Figure 3 is a set of waveforms
- Figure 4 is a block diagram illustration of an external cache memory for a typical computer system.
- Figure 5 is a set of waveforms typically used in connection with the internal cache of the 80486 microprocessor.
- Figure 6 is a block diagram of a computer system based on the model 80386 microprocessor.
- Figure 7 is a block diagram of a computer system based on the model 80486 microprocessor.
- Figure 8 is a block diagram of the data path of a burst RAM cache memory in accordance with the present invention.
- Figures 9 and 9A are diagrams illustrating a direct-mapped, one-way associative cache having one bank.
- Figures 10 and 10A are diagrams illustrating a two-way set associative cache having two banks.
- Figure 11 is a diagram illustrating four burst RAM cache memory chips arranged in a 32-bit configuration.
- Figures 12A and 12B are generalized diagrams of the data path of the burst RAM cache memory 72.
- Figure 13 is a diagram illustrating the host interface between a host bus and cache memory 72 and controller 70.
- Figure 14 is a diagram illustrating the system interface between a system bus and cache memory 72 and controller 70.
- Figure 15 is a diagram showing the internal blocks of cache memory 72.
- Figure 16 is a set of waveforms showing control and data signals for a single host port read operation.
- Figure 17 is a set of waveforms showing control and data signals for a host port burst read operation.
- Figure 18 is a set of waveforms showing control and data signals for a host port single write operation.
- Figure 19 is a set of waveforms showing control and data signals for a system port single read operation.
- Figure 20 is a set of waveforms showing control and data signals for a system port single write operation.
- Figure 21 is a set of waveforms showing control and data signals for a buffered host to system bypass operation.
- Figure 22 is a set of waveforms showing control and data signals for a buffered host to system bypass operation.
- Figure 23 is a set of waveforms showing control and data signals for a system to host port bypass operation.
- Figure 24 is a set of waveforms showing control and data signals for a system to host port bypass operation with reordering.
- Figure 25 shows an example cache line used in the operation of the system to host port bypass sequence of Figure 26.
- Figure 26 is a set of waveforms showing control and data signals for a system to host port bypass operation with update and partially dirty line.
- Figure 27 is a set of waveforms showing control and data signals for an advance write and subsequent quad fetch operation.
- Figure 28 is a set of waveforms showing control and data signals for a read tag miss operation with one write-back.
- Figure 29 is a set of waveforms showing an advance write operation with subsequent quad fetch and one write-back.
- Figure 30 is a set of waveforms showing control and data lines for an advance write operation with subsequent quad fetch and one write-back to a neighboring line.
- Figure 31 is a set of waveforms showing control and data signals for several operations occurring within the burst RAM cache memory.
- Figure 32 is a diagram showing internal blocks of cache controller 70.
- Figure 33 is a diagram showing functional block pin groups of cache controller 70.
- Figure 34 is a set of waveforms showing control and data signals for a system read cycle operation.
- Figure 35 is a set of waveforms showing control and data signals for a system write cycle operation.
- Figure 36 is a set of waveforms showing control and data signals for a buffered NCA write cycle operation.
- Figure 37 is a set of waveforms showing control and data signals for a controller register read operation.
- Figure 38 is a set of waveforms showing control and data signals for a controller register write operation.
- Figure 39 is a set of waveforms showing control and data signals for a 486 CPU burst read cache hit operation.
- Figure 40 is a set of waveforms showing control and data signals for a 486 CPU non-burst read cache hit operation.
- Figure 41 is a set of waveforms showing control and data signals for a read line miss and resulting quad fetch operation.
- Figure 42 is a set of waveforms showing control and data signals for a read line miss operation with no replacement.
- Figure 43 is a set of waveforms showing control and data signals for a read line miss operation with replacement.
- Figure 44 is a set of waveforms showing control and data signals for a multiple write-back cycle operation.
- Figure 45 is a set of waveforms showing control and data signals for a cacheable write hit cycle operation.
- Figure 46 is a set of waveforms showing control and data signals for a write tag miss operation with one write-back cycle and concurrent processing.
- Figure 47 is a set of waveforms showing control and data signals for a write line miss operation and resulting system quad fetch.
- Figure 48 is a set of waveforms showing control and data signals for a write tag miss operation with one write-back cycle.
- Figure 49 is a set of waveforms showing control signals for the optional address transceivers.
- Figure 50 is a set of waveforms showing control and data signals for a flush activation operation, followed by acquiring local bus and first write-back.
- Figure 51 is a set of waveforms showing control and data signals for a snoop read miss operation.
- Figure 52 is a set of waveforms showing control and data signals for a snoop read hit operation.
- Figure 53 is a state diagram illustrating the initial sequencing of the concurrent bus control unit of cache controller 70.
- Figure 54 is a state diagram illustrating sequencing during a read tag miss operation.
- Figure 55 is a state diagram illustrating sequencing during a write tag miss operation.
- Figure 56 is a state diagram illustrating sequencing during a read line miss and write line miss operation.
- Figure 57 is a diagram showing the state machines within bus controller 200.
- Figure 58 is a diagram showing the state machines within bus controller 202.
- a burst RAM cache memory in accordance with the present invention is suitable for use with a variety of microprocessors.
- a block diagram of a computer system based on the model 80386 microprocessor 60 is shown in Figure 6.
- the cache memory 72 consists of four burst RAM cache memory chips 72A-72D.
- Control signals are communicated between microprocessor 60 and the cache controller 70 over control line 64, and control signals are communicated to the cache memory 72 over control line 66.
- the cache memory 72 is also provided with a system port SP, which is connected over a bidirectional multiple signal line 74 to the system bus 34.
- the cache controller 70 is provided with a system bus control port CSB, which is connected over a bidirectional multiple signal line 76 to the system bus 34.
- A block diagram of a computer system based on the model 80486 microprocessor 62 is shown in Figure 7.
- the system illustrated in Figure 7 is similar to that of
- address bus 27 is driven by microprocessor 62.
- the address bus 27 is driven through the system bus during a cache invalidation cycle.
- A diagram of the burst RAM cache memory chip in accordance with the invention is shown in Figure 8.
- the burst RAM memory chip is illustrative of each of the burst RAM memory chips 72A-72D of Figures 6 and 7.
- the three major sections of the burst RAM memory chip as shown in Figure 8 are the RAM array section 100, the address latches and multiplexer section 102, and the control logic and sequencing section.
- the burst RAM memory chip is also shown with control signal lines within each section which illustrate some of the control signals for control of the data paths. The details of how the control signals affect data flow are discussed below.
- the organization of RAM array section 100 is first considered.
- the RAM array section 100 is organized as a two-way set associative cache without data buffers, and includes two banks 106 and 108 of 2K x 36-bit static random access memory ("SRAM") (plus parity bits).
- the two-bank division readily accommodates either a two-way set associative cache or a 64K-byte direct-mapped cache.
- more burst RAM memory chips can be added for larger caches.
- Each bank 106 and 108 of RAM array section 100 is divided into four subarrays I-IV.
- Each subarray includes 2K x 8-bit memory locations and further includes a parity bit associated with each 8-bit memory location.
- An address bus 174 is connected to and provides addressing signals to each of banks 106 and 108.
- the addressing signals are decoded within each of the banks 106 and 108 to thereby select one of the 2k locations within each subarray I-IV.
- RAM array section 100 is suitable for many memory organizations, including a direct mapped cache organization and a two-way set associative organization.
- Associativity refers to the number of banks of the cache into which a memory block may be mapped.
- a bank, also known as a frame, is the basic unit into which a cache memory is divided.
- a direct-mapped (one-way associative) cache such as that shown at 60 in Figure 9 has one bank 61.
- a two-way set associative cache such as that shown at 70 in Figure 10 has two banks 71A and 71B.
- a bank is equal to the cache size in a direct-mapped cache and one-half the cache size in a two-way set associative cache.
- a "page,” which corresponds in size to a cache bank, is the basic unit into which the physical address space is divided.
- page 0 is shown at 63A in physical memory 62, at 73A in physical memory 72, and at 83A in physical memory 82 respectively. Both memory banks and "pages" are subdivided into blocks; a block is the basic unit of cache addressing. Blocks are shown at 64A and 64B, and at 74A and 74B in Figures 9 and 10 respectively.
- Cache address information is stored in a directory.
- the single directory 65 includes 2048 cache address entries, each with three bit fields.
- the first bit field is a 12-bit directory tag for selecting one of the 2^12 pages of main memory.
- the second bit field is an 11-bit set address for selecting one of the 2048 sets in the cache.
- the third bit field is a 3-bit line address for selecting one of eight lines in a set.
- each of the two directories 75A and 75B includes 1024 cache address entries, each with three bit fields.
- the first bit field is a 17-bit directory tag for selecting one of the 2^17 pages of main memory.
- the second bit field is a 10-bit set address for selecting one of the 1024 sets in the cache.
- the third bit field is a 3-bit line address for selecting one of eight lines in a set.
- directory tag refers to that part of a directory entry containing the memory page address from which that particular block was copied. Any block with the same offset within a page may be mapped to the same offset location in a bank. The tag identifies the page from which the block came.
- set refers to all of the directory entries associated with a particular block offset.
- the number of sets equals the number of blocks.
- a set has two entries, each pointing to a different bank.
- line refers to the basic unit of data transferred between the cache and main memory.
- a block consists of contiguous lines. Each line in a block has a corresponding "line valid" bit in the block's directory entry.
- Line valid bits are set when the line is written and are cleared when the block tag changes.
- the number of lines in a block and the line size is determined by the number of valid bits in each directory entry or by cache controller convention. For example, although each directory entry for the cache memory chip of Figure 8 could have enough valid bits to give each 32-bit doubleword its own valid bit, the associated cache controller instead operates on four doublewords as a line.
- a "hit” or “miss” decision is based on the presence or absence of a line within the cache. It is noted that the cache controller 70 associated with the cache memory chip of Figure 8 designates two lines for each tag, and hence, there are eight thirty-two bit doublewords per tag.
- Referring to Figure 9A, each physical address that the CPU 60 asserts can map into only one location in the cache.
- the address of each cycle driven by the CPU 60 is broken down into several components: the set index, tag, line select, and doubleword select.
- CPU address lines A15:5 (11 bits) compose the set index, and determine which cache location the address can map into.
- Address lines A27:16 (12 bits) are the tag. Assertion of a bus cycle by the CPU 60 generates a tag comparison, with A27:16 being compared against the tag for the given set index. A match of all 12 bits indicates a cache hit.
- Address line A4 selects between the two lines of a block.
- Address lines A3 and A2 select individual doublewords within a line, and are not included in cache hit/miss determinations.
- tag array bits, tag valid, doubleword valid, and the doubleword dirty bits are contained in the cache controller 70, while the actual data for doublewords 0-3 are contained in the cache memory 72.
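The address breakdown described above for the direct-mapped configuration (A3:2 doubleword select, A4 line select, A15:5 set index, A27:16 tag) can be sketched directly:

```python
def split_address(addr: int) -> dict:
    """Split a CPU physical address into the fields used by the
    direct-mapped organization of Figure 9A."""
    return {
        "dword": (addr >> 2) & 0x3,     # A3:2  - doubleword within the line
        "line":  (addr >> 4) & 0x1,     # A4    - line within the block
        "set":   (addr >> 5) & 0x7FF,   # A15:5 - 11-bit set index
        "tag":   (addr >> 16) & 0xFFF,  # A27:16 - 12-bit tag
    }
```

Note that the doubleword-select bits never enter the hit/miss comparison; only the tag is compared against the directory entry chosen by the set index.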
- each physical address can map into two locations in the cache.
- the 2048 entries are mapped into two parallel sets of 1024 entries each.
- the cache set index is now 10 bits instead of 11, and the tags are 13 bits instead of 12.
- a bus cycle now triggers two comparison operations, one for each set, at the given set index, with a hit (match) possibly occurring in either set.
- the results of the comparison are OR'd together, with a high value from the OR output indicating a tag hit.
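The two-way lookup above (10-bit set index, 13-bit tags, one comparison per directory, OR'd into the hit signal) can be sketched as follows; the exact address bit positions are an assumption for illustration:

```python
def two_way_hit(addr: int, directory_a: list, directory_b: list) -> bool:
    """One tag comparison per bank directory at the same set index;
    the two results are OR'd to form the tag-hit signal."""
    set_index = (addr >> 5) & 0x3FF    # assumed 10-bit set index
    tag = (addr >> 15) & 0x1FFF        # assumed 13-bit tag
    hit_a = directory_a[set_index] == tag
    hit_b = directory_b[set_index] == tag
    return hit_a or hit_b              # a hit may occur in either bank
```

Halving the set count and widening the tag by one bit is exactly the trade the text describes: the same 2048 entries, rearranged as two parallel sets of 1024.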
- The size of each cache bank, the number of banks, the number of blocks, and the number of lines per tag may be varied without departing from the spirit and scope of the invention.
- Address signal ADDR carried on the host address bus 24 is an 11-bit (<14:4>) signal that addresses the RAM array 100.
- An address multiplexer 103 and two address registers, hit address register 109 and miss address register 110, are included in section 102. Under certain circumstances, bits <14:4> of the address signal ADDR are latched into the hit address register 109 and furnished therefrom through the address multiplexer 103 to RAM array 100.
- the address signal ADDR from the CPU 60 is latched into hit address register 109 through activation of the ADS# signal asserted by the CPU 60.
- the address information latched in hit address register 109 is provided to the address decoder of the RAM array 100.
- the 11-bit output of the hit address register 109 is latched into the miss address register 110 and furnished therefrom through the address multiplexer 103 to RAM array 100.
- the 11-bit address signal ADDR is furnished directly through the address multiplexer 103 to RAM array 100.
- Hit address register 109 includes a control line for receiving a signal CALE or the ADS# signal.
- miss address register 110 includes a control line for receiving a signal MALE. A high assertion of the MALE signal causes latching of the address signal into miss address register 110.
- Miss address register 110 is used during miss cycles to latch the initial miss address. This address is later used during miss processing, as explained below, so that data retrieved from the system can be correctly updated into the RAM array section 100. It is noted that the address in the hit address register 109 is latched into the miss address register 110 through the activation of signal MALE.
- the burst RAM cache memory chip is provided with a 9-bit (including parity) host port HP 113 which is connected to the local (host) data bus (bus 26 of Figures 6 and 7) through a 9-bit line.
- Each burst RAM chip is also provided with a 9-bit (including parity) system port SP 112 which transfers an appropriate byte of information between the system data bus (bus 34 of Figures 6 and 7) through its 9-signal line.
- the local and system data buses are four bytes or 32 bits wide, so four burst RAM chips are required to support the 32-bit systems shown in Figures 6 and 7.
- burst RAM cache memory chip 72A supports the most significant byte of the local and system buses, burst RAM cache memory chip 72B the next most significant byte, burst RAM cache memory chip 72C the next most significant byte, and burst RAM cache memory chip 72D the least significant byte.
- a single doubleword (where each word comprises sixteen data bits) is stored in four bytes.
- Each cache memory chip 72A-72D stores one of the four bytes of a doubleword.
- Each doubleword is stored within a single set of the subarrays labelled either I, II, III, or IV.
- a "line” refers to the four adjacent doublewords stored within each set of the subarrays I-IV.
- the set of four subarrays labeled I of burst RAM chips 72A-72D contain the 32-bit doubleword located at the line's highest address.
- Subarray sets II, III, and IV contain the third, second, and first addressed doublewords in the line, respectively.
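The byte-lane and subarray organization described above can be summarized in a small behavioral sketch. This is illustrative only (Python, with function names of our choosing, not part of the disclosure):

```python
LINE_BYTES = 16  # a line is four 32-bit doublewords


def subarray_set(addr):
    """Map a byte address to the subarray set (I-IV) holding its doubleword.
    Per the text, the first-addressed doubleword of a line is stored in
    set IV and the highest-addressed doubleword in set I."""
    dw_index = (addr % LINE_BYTES) // 4  # doubleword index, i.e. A<3:2>
    return ("IV", "III", "II", "I")[dw_index]


def chip_for_byte(byte_lane):
    """Map a byte lane (3 = most significant .. 0 = least significant) to
    the burst RAM chip that stores it."""
    return {3: "72A", 2: "72B", 1: "72C", 0: "72D"}[byte_lane]
```

For example, the doubleword at the lowest address of a line falls in subarray set IV, and its most significant byte is stored in chip 72A.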
- the burst RAM chip data path further includes three sets of holding registers 114A-114D, 116A-116D and 118A-118D, and a memory write register 120.
- a memory read hold register set (“MRHREG”) includes four 8-bit registers 114A-114D and is provided to support data bus burst read operations on the host data port 113. Each of the four registers 114A-114D includes an additional bit for parity.
- memory update register set 116 includes four 8-bit (plus parity) registers 116A-116D and is provided to accommodate quad fetch miss data operations from system memory 38.
- memory write register 120 is an 8-bit register (plus parity) provided to accommodate scalar write operations on the host data port 113.
- the organization of Figures 12A and 12B corresponds to that of Figure 8, and includes hit address register 109, miss address register 110, RAM array section 100, read hold register set 114, memory update register set 116, write back register set 118, and write register 120.
- the generalized diagrams of Figures 12A and 12B may be referred to for simplifying the descriptions of the cache memory 72 contained herein.
- the burst RAM cache memory 72 and controller 70 in accordance with the invention control the paths of data in response to various control signals. These signals, and the portions of the cache memory 72 and controller 70 from which and to which they are provided, are listed below in Tables I and II with a brief description of each.
- CCLK provides the fundamental timing and internal operating frequency for the controller 70.
- CCLK only needs TTL levels for proper operation. All external timing is referenced to CCLK.
- RESET Input. The RESET input is used to initialize the controller 70 to a known state.
- A20M# Input Address 20 Mask. Asserting the A20M# input causes the controller 70 to mask physical address bit A<20> before performing a tag directory comparison and before driving a memory cycle to the outside world.
- the controller 70 emulates the 1 Mbyte address space of the 8086. The signal is only asserted when the host CPU is in real mode.
- This signal is connected to the A20GATE signal of most IBM PC/AT compatible chipsets.
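The A20M# behavior amounts to clearing a single address bit before the tag comparison. A minimal sketch (illustrative only, not the disclosed circuit):

```python
def mask_a20(addr, a20m_asserted):
    """Clear physical address bit A<20> when A20M# is asserted, emulating
    the 1 Mbyte wrap-around of the 8086 real-mode address space. The masked
    address is then used for the tag directory comparison and the external
    memory cycle."""
    return addr & ~(1 << 20) if a20m_asserted else addr
```

With A20M# asserted, address 0x100000 wraps to 0x000000, as on the 8086; addresses below 1 Mbyte are unaffected.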
- This signal is directly connected to the 486 M/IO# pin.
- This signal is directly connected to the 486 D/C# pin.
- This signal is directly connected to the LOCK# pin of the 486.
- ADS# Input Address Status. This signal (an output of the 486) indicates that the address and bus cycle definition signals are valid.
- ADS# is active on the first clock of a bus cycle and goes inactive in the second and subsequent clocks of the bus cycle. This input has a weak pull-up resistor.
- the controller 70 uses ADS# together with other ready indication signals to monitor 486 local bus activity.
- RDY# Input Local Bus Cycle Ready Input.
- assertion of RDY# indicates the completion of any local (Weitek) bus cycles. This signal is ignored at the end of the first clock of a bus cycle. This input has a weak pull-up resistor.
- a programmable option through the controller 70 allows asynchronous RDY# input. This asynchronous option allows a coprocessor with slow output delay to interface with the controller 70. In asynchronous mode, the controller 70 will forward RDY# to the RDYO# output in the next clock. In synchronous mode, RDY# will be forwarded to RDYO# in the same clock provided that setup time is met. After reset, RDY# is assumed to be synchronous.
- This signal is directly connected to the 486 RDY# pin.
- the controller 70 completes cycles such as cache read hits, write cycles (hit/miss), and controller 70 control register update cycles. For cycles not directly completed by the controller 70, such as read misses and system cycles, the controller 70 will forward either the SBRDY1# or SRDY1# signals from the system memory bus as RDYO# to the CPU.
- either SRDY1# or SBRDY1# can be returned by the system; in either case, the signal will be passed to the CPU as RDYO#.
- This signal is directly connected to the RDY# input pin of the 486.
- a numerics co-processor NPX
- READYO# of the NPX will be connected to RDY# of the controller 70.
- the controller 70 will then forward READYO# from the NPX to the CPU.
- BRDYO# Output 486 Burst Cycle Ready Output.
- the controller 70 drives this output on the processor local bus, indicating the completion of a burst read hit data cycle. In cache subsystems using cache memory 72 Burst-RAMs, the controller 70 will forward the SBRDY1# signal from the system memory bus for cache read miss cycles.
- BLAST# Input Burst Last This signal indicates that the next BRDY# or RDY# returned will terminate the 486 host cycle.
- the controller 70 only samples BLAST# in the second or subsequent clocks of any bus cycle.
- This signal is directly connected to the 486 BLAST# pin.
- BOFF# Output Host CPU Back-Off. This signal is used by the controller 70 to obtain the 486 local bus. During snoop read hits, the controller 70 asserts BOFF# to the 486 one clock after AHOLD is asserted in order to access the cache data array. See section entitled “Snoop Operations" for a complete discussion.
- HOLD Output Host CPU Bus Hold Request. This signal is used for flush operations in order to obtain the local CPU bus. During either hardware or software flushes, the controller 70 will assert HOLD to the CPU. HOLD is released upon completion of the flush operation.
- assertion of HLDA indicates that the controller 70 has been granted the local bus to begin flush operations.
- controller 70 will begin generating write-back cycles to the system to clear lines which contain dirty data.
- LBA# Input Local Bus Access This pin indicates to the controller 70 that the current bus cycle should occur only on the host (local CPU) bus. Assertion of this signal will prevent any system read or write operations from occurring as a result of the current cycle. However, this signal must be asserted to the controller 70 in the T1 state for proper operation.
- the FLUSH# input is used to initiate a cache flush operation.
- This input has a weak pull-up resistor.
- This input clears all valid, dirty and LRU bits in the controller 70.
- the controller 70 will copy all dirty valid data back to system memory before executing the flush operation.
- This pin should be directly connected to the FLUSH# pin of the 486.
- PCD Input Page Cache Disable This pin provides a cacheable/non-cacheable indication on a page-to-page basis from the 486. The 486 will not perform a cache fill for any data cycle when this signal is asserted.
- PCD reads are cached by the controller 70.
- PCD read cycles which are cache hits are treated as normal cacheable cycles, with data being returned to the CPU in zero wait states. However, this data will not be cached inside the CPU.
- PCD write cycles generate buffered write-through cycles to the system.
- This signal is connected directly to the 486 PCD pin.
- PWT Input Page Write Through Assertion of this signal during a write cycle will cause the controller 70 to treat the current write cycle as a write-through cycle.
- a hit on a PWT write cycle will cause an update both in the data cache and main memory.
- a miss on a PWT write cycle will update only main memory, and will not generate a system quad fetch.
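The PWT hit/miss behavior described above reduces to a small decision rule. A hypothetical sketch (names are ours, not from the disclosure):

```python
def pwt_write_actions(hit):
    """Actions for a write cycle with PWT asserted: a hit updates both the
    data cache and main memory; a miss updates only main memory and does
    not generate a system quad fetch."""
    return {
        "update_cache": hit,          # cache is updated only on a hit
        "update_main_memory": True,   # write-through always reaches memory
        "quad_fetch": False,          # no allocation on a PWT miss
    }
```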
- KEN# Output Cache Enable. KEN# is used to indicate to the 486 if the data returned by the current cycle is cacheable.
- a cycle to a protected address region will cause the controller 70 to deassert KEN# in T1 to prevent the i486 CPU from performing any cache line fills. KEN# will continue to be deasserted until RDYO# or BRDYO# is returned, whether the cycle is a cache hit or miss.
- KEN# will also be deasserted for data passing between the 486 and the Weitek co-processor, since such data is not cached, and in order to support long instruction execution of the co-processor.
- This signal is directly connected to the 486 KEN# input.
- the controller 70 will assert EADS# for main memory write cycles as indicated by MEMWR# signal being high from system memory bus interface.
- AHOLD Output Address Hold Request Asserting this signal will force the 486 to float the address bus in the next clock. While AHOLD is active, only the address bus will be floated, the data portion of the bus may still be active.
- the controller 70 uses AHOLD to attain address bus mastership for performing an internal cache invalidation to the 486 when system writes occur.
- This signal is directly connected to the AHOLD pin of the 486.
- HPOEA#, HPOEB# Outputs Host Port Output Enables. These signals are connected to the host port (G0#, G1#) data output enable inputs of the Cache RAMs to individually enable the selected cache bank to drive the data bus. If cache memory 72 is used, these signals will be connected to the G0# and G1# inputs.
- HPWEA#, HPWEB# Outputs Host Port Write Enables. These signals are connected to the write enable (HW0, HW1) inputs of the Cache RAMs in order to individually enable the selected cache bank to receive data. If cache memory 72 is used, these signals are connected to the HW0# and HW1# inputs.
- CCS<3:0># Outputs Cache Chip Select. These signals are connected to the chip select inputs of the Cache RAMs associated with each byte of the data word. If cache memory 72 is used, these signals are connected to the SELECT# inputs.
- MALE Output Miss Address Latch Enable. This signal activates after ADS# activation if either a read or write miss has been detected. Assertion of MALE and HPWEA# or HPWEB# in the same clock will inhibit writing to the data RAM array section inside Burst-RAM cache memory 72.
- Activation of MALE will reset all valid bits associated with each data byte in memory update register set 116.
- WBSTB Output Write-Back Strobe.
- the rising edge of WBSTB (BMUXC<0>) results in the Burst-RAM data entry associated with miss address register 110 being latched into write back register set 118 (write-back register). This data will be written to main memory later if it is dirty.
- the controller 70 will assert WBSTB on the clock after MWB is asserted to allow Burst-RAM data entry associated with miss address register 110 to be latched into write back register set 118.
- This signal is directly connected to the DW# inputs of the cache memory 72 chips.
- DW# is also used as a signal to the system memory controller during quad write cycles, to indicate dirty/non-dirty status of the corresponding doubleword. In the non-dirty case, the system can ignore the driven data and immediately return SRDYI# or SBRDYI#.
- This signal is connected directly to the BYPASS input of the cache memory 72.
- this signal is asserted one clock after SADS# to be compatible with 486 write cycles.
- This signal is connected directly to the MWB input of the cache memory 72.
- These bits indicate the word address within a quad word. They are part of the address associated with data at the host port.
- CALE Output Controller ALE.
- CALE is generated by (HALE) the controller 70 during the first bus state of controller-initiated cycles.
- the controller 70 will generate CALE during flush operations and snoop read hits.
- Upon assertion of CALE, the cache memory 72 will latch the controller 70-generated address.
- This pin is connected directly to the cache memory 72 CALE pins.
- SADS# Output System Bus Address Status.
- SADS# is the system equivalent of 486 ADS#.
- SADS# is asserted in the first clock of a system bus cycle. This pin is tristated when the controller 70 does not have system bus ownership.
- the controller 70 uses these inputs as snoop system bus address when some other system bus master controls the bus.
- the controller 70 will drive these address lines to system memory during miss processing, system and write- through cycles.
- the controller 70 will float SA<27:4> during system idle states or at the end of system bus cycles.
- SA<27:21> have weak pull-down resistors.
- SA<3:2> indicate the address of each 32-bit doubleword within a quad line. They play a similar role as SA<27:4>, except that during burst cycles SA<3:2> are wrapped around a line (16 bytes)
- the controller 70 can control SA<3:2> sequentially or use the 486 burst order. This is controlled by a bit in the control register.
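The two SA<3:2> sequencing options can be sketched as follows. The 486 burst order is equivalent to XORing the starting doubleword index with the transfer number; this is an illustrative model, not the disclosed circuit:

```python
def burst_sequence(start_dw, use_486_order):
    """Return the SA<3:2> values (doubleword index 0-3) for the four
    transfers of a burst, wrapped within one 16-byte line. A control
    register bit selects sequential wrap versus Intel 486 burst order."""
    if use_486_order:
        return [start_dw ^ n for n in range(4)]    # Intel 486 burst order
    return [(start_dw + n) % 4 for n in range(4)]  # sequential wrap
```

Starting at doubleword 1, 486 order yields indices 1, 0, 3, 2 (byte addresses 4, 0, C, 8), while sequential order yields 1, 2, 3, 0.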
- SRDYI# indicates the completion of any non-burst system bus cycles. Simultaneously asserting both SRDYI# and SBRDYI# signals will pass SRDYI# to the CPU. The controller 70 will forward only the SRDYI# input to RDYO# for non-cacheable cycles. This input has a weak pull-up resistor.
- the controller 70 will pass SBRDYI# back to the host CPU as BRDYO#, except for system read cycles. For these cycles, assertion of either SRDYI# or SBRDYI# will be passed to the CPU as RDYO#.
- SBLAST# is asserted for all system bus cycles.
- SBLAST# is driven to a valid level in T1, instead of being indeterminate. Hence, SBLAST# will be valid in the same clock as SADS#.
- SBLAST# will be asserted during the last transfer of any system bus cycle.
- This pin is tri-stated during system bus hold.
- LOCK# is asserted by the CPU for indivisible read-modify-write operations.
- This signal defines the direction of the optional system bus data transceivers.
- This signal is connected to the DIR pin of the 646 data transceivers.
- the controller 70 will deassert SD OE# if another slave device on the system bus is granted bus ownership.
- the controller 70 will assert SLDSTB four times.
- SACP will be asserted only at the beginning of a burst cycle.
- SA OE# enables the external address latch/transceiver outputs when the controller 70 is the current bus master and disables them otherwise.
- the controller 70 will deassert SA OE# if SHOLD is acknowledged.
- This signal is connected to OE# of the external address bus latch or external address bus latch/transceivers.
- SA DIR Output System Address Direction This signal controls the DIR (direction) input of the optional address transceivers.
- SA DIR is high when the controller 70 owns the bus and is low when the controller 70 grants ownership of the system bus.
- SA DIR toggles low or high the clock following change of SHLDA.
- SDOE# and SAOE# will be deasserted.
- the controller 70 provides for partially valid lines. To support this, SA<3:2> must be driven by the system to correct levels along with SA<27:4> when SEADS# is asserted.
- CONTROLLER 70 SIGNALS
- Snoop writes: SEADS# and SMEMW/R# high should be asserted by the system during main memory writes.
- the controller 70 will forward SEADS# and the invalidate address to the CPU.
- A<31:28> will be driven as zeroes on the processor local bus to properly invalidate the internal CPU cache.
- the controller 70 will assert EADS# for only one clock for each system bus snoop write.
- the SBE<3:0># signals and write data should be driven to correct levels. This data will be updated into the cache data array if a snoop write hit occurs.
- Snoop reads: Assertion of SEADS# and a low assertion of SMEMW/R# will result in the controller 70 asserting SMEMDIS if an address match has been detected in the controller 70 cache directory.
- the controller 70 will write dirty data associated with the driven address onto the system bus for correct operation. No EADS# assertion will be sent to the local processor bus for a system bus read cycle.
- the controller 70 will sample SEADS# every clock.
- This signal indicates whether a snoop read or snoop write is occurring.
- Snoop reads trigger SNPBUSY and a tag lookup, while snoop writes trigger an invalidate to the 486 CPU.
- the falling edge of SNPBUSY indicates that the memory controller should return SRDYI# or SBRDYI# to complete the snoop write operation.
- SNPBUSY remaining high indicates that the necessary tag lookup has not yet completed.
- the state of SMEMDIS indicates whether the controller 70 or the memory system will supply data to satisfy the snoop read request.
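The snoop decisions described above reduce to a small amount of logic. A behavioral sketch (signal names are from the tables above; the structure and return values are ours):

```python
def snoop_actions(seads, smemwr_high, tag_hit):
    """SEADS# with SMEMW/R# high is a snoop write: an invalidate (EADS#)
    is forwarded to the 486. SEADS# with SMEMW/R# low is a snoop read: a
    tag lookup runs and SMEMDIS is asserted on a directory hit, so the
    controller rather than the memory system supplies the data. No EADS#
    is sent to the local processor bus for a snoop read."""
    if not seads:
        return {"eads_to_cpu": False, "smemdis": False}
    if smemwr_high:                                    # snoop write
        return {"eads_to_cpu": True, "smemdis": False}
    return {"eads_to_cpu": False, "smemdis": tag_hit}  # snoop read
```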
- this signal indicates that the current local bus cycle is addressed to the controller 70.
- These cycles include controller 70 Control Register Address Index load cycles and Control Register data read and write cycles.
- MCCSEL# is connected to local bus decode logic output for the controller 70 address cycles. This input has a weak pull-up resistor.
- The D<8:0> data pins are connected to the least significant byte of the local processor data bus. They are used to load the controller 70 Control Register index address and data content from the 486.
- D<8:0> can be used as an output to read the internal states of the controller 70.
- controller 70 asserts SBREQ if an internal system request is pending.
- SBREQ is asserted the same time as SADS#. In case the controller 70 does not currently own the bus, SBREQ will be asserted the same clock that SADS# would have, had the controller 70 owned the bus.
- This signal plays the same role as the 486 BREQ pin on the local processor bus.
- ADS# Input Address/Data Strobe This is the latch enable strobe for CPU-generated bus cycles. The falling edge of this signal creates a flow-through mode for the hit address register 109. The rising edge of ADS# latches the address into hit address register 109.
- CALE (HALE) Input Controller Address Latch Enable. This is the latch enable strobe for controller 70-generated bus cycles. CALE opens the hit address register 109 level latch to ADDR.
- MALE Input Miss Address Latch Enable This signal should be asserted after ADS# or CALE activation if a miss for either read or write cycles has been detected. MALE performs several functions. First, it latches the address in hit address register 109 into miss address register 110. It also inhibits any host port writes from being performed on clock edges that it is sampled high.
- MALE latches bank select information in order to direct a subsequent update into the cache array. Activation of MALE with either HPOEA# or HPWEA# indicates that the next quad write (QWR) operation will update bank A, while MALE and either HPOEB# or HPWEB# selects bank B. Bank selection, as described above, must be qualified with assertion of the SELECT# input.
- MALE has a higher priority than does the MWB input. Assertion of both MALE and MWB on the same clock edge will result in recognition of MALE, but not MWB.
- Reset should be active for at least four CLK clocks for cache memory 72 to complete reset.
- WBSTB Input Write Back Strobe. The rising edge of WBSTB (BMUXC<0>) results in the cache memory 72 data entry associated with miss address register 110 being latched into write back register set 118. This data should be written to main memory if any is dirty.
- RHSTB Input Read Hit Strobe. The rising edge of RHSTB (BMUXC<1>) results in the cache memory 72 data entry associated with hit address register 109 being latched into read hold register set 114. This data will subsequently be burst to the 486 CPU.
- These bits are part of the address associated with the data at the host port of the cache memory 72 chips.
- HPWEA#, HPWEB# (HW0, HW1) Inputs Host Port Write Enables. HPWEA# is the host port write enable for bank A and HPWEB# is the host port write enable for bank B.
- a low assertion of either HPWEA# or HPWEB# indicates that the corresponding bus cycle generated by the host is a write cycle.
- HPOEA#, HPOEB# (G0#, G1#) Inputs Host Port Buffer Output Enables. HPOEA# will enable data from bank A for the corresponding cycle, while HPOEB# will enable data from bank B. Both signals cannot be active simultaneously.
- SELECT# must be asserted in order to enable the host port outputs.
- These address bits are part of the address associated with the data at the system port of the cache memory 72.
- SPOE# (SW#) Input System Port Output Enable. A low assertion of SPOE# indicates that the system port will be performing a write operation to main memory. For burst writes, SPOE# should be asserted for all four write data transactions.
- SELECT# must be asserted to enable the system port outputs.
- SPOE# acts as a direction control signal.
- BYPASS asserted high with SPOE# low creates a host-to-system port bypass, with the contents of write register 120 being driven onto the system port.
- BYPASS asserted while SPOE# is high will generate a system-to-host bypass, with system port data being passed directly to the host port.
- Quad Write results in the data residing in memory update register set 116 being written into the cache memory 72 data array entry, at the address pointed to by miss address register 110 and the cache memory 72 internal bank select logic.
- QWR overrides the mux control logic.
- QWR resets both the bank select information previously latched through assertion of MALE, and all mask and valid bits associated with memory update register set 116.
- SELECT# For any data outputs to be enabled, SELECT# should be asserted. For complex operations which utilize both the host and system ports, SELECT# should be asserted for the entire operation.
- MWB Input Multiple Write-Back.
- MWB should be used for write-back cache architectures where each tag entry corresponds to two lines. MWB is asserted during writeback cycles, in the case where the other line associated with the replaced tag is dirty and needs to be written back to memory. The assertion (high) of MWB toggles the A4 bit of the address stored in miss address register 110. Subsequent assertion of WBSTB then loads write back register set 118 with the second line of data to be written back to the system.
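The A4 toggle performed by MWB is a single bit flip of the latched miss address. A sketch, assuming the conventional bit numbering in which A4 distinguishes the two lines sharing one tag entry:

```python
def mwb_toggle_a4(miss_addr):
    """Toggle address bit A4 of the latched miss address so that write back
    register set 118 can subsequently be loaded (via WBSTB) with the second
    line associated with the replaced tag."""
    return miss_addr ^ (1 << 4)
```

Toggling twice restores the original address, so consecutive MWB assertions alternate between the two lines of the tag entry.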
- the memory write register 120 can be used as a buffer on write cycles, for either buffered write-through cycles or buffered non-cacheable writes. For example, write data can be held in write register 120 and driven onto the system port until the system accepts it.
- the read hold registers 114A-114D are used to allow one clock-burst read operation from the cache memory 72. During the first transfer of a burst read, 32 bits of data are read into read hold registers 114A-114D. To complete the second, third and fourth transfers, the contents of the read hold registers 114A-114D are driven on the local CPU data bus 26, one doubleword at a time. A burst read hit causes all four 32-bit doublewords within the same line to be read into the read hold registers 114A-114D.
- the first demand doubleword is fetched and sent to the host port 113 directly from the RAM array section 100.
- Burst-order is controlled by signal HA<3:2> and is totally transparent to the cache memory 72.
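The burst read hit sequence can be modeled behaviorally: the first transfer comes straight from the RAM array while the whole line is latched into the read hold registers, which then serve the remaining transfers. This is an illustrative model only, assuming 486 burst order on HA<3:2>:

```python
def burst_read_hit(ram_line, start_dw):
    """ram_line: the four doublewords of the hit line. Returns the data
    driven on the host port for the four burst transfers."""
    hold_regs = list(ram_line)                 # RHSTB latches the full line
    order = [start_dw ^ n for n in range(4)]   # HA<3:2> in 486 burst order
    first = ram_line[order[0]]                 # transfer 1: direct from array
    rest = [hold_regs[i] for i in order[1:]]   # transfers 2-4: from registers
    return [first] + rest
```

After the first transfer, the RAM array is free for system port operations while the hold registers complete the burst, which is the property the text emphasizes.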
- the write-back registers 118A-118D are used to hide the write-back cycles that occur when system data from read misses replace dirty data.
- the cache memory 72 and controller 70 in accordance with the invention allow these quad writes to be burst to the system, if the system memory can accept such bursts.
- the write-back registers 118A-118D hold the data from the selected data line in the RAM array section 100 to be replaced.
- This data is written back to main memory if the replaced line contains valid and dirty data. This allows burst writes on the system memory port 112 without requiring access to the RAM array section 100 for write-back purposes.
- the RAM array section 100 is available to serve the host port 113 for local bus read and write hits.
- Memory update registers 116A-116D are used as a holding register for incoming data from system quad fetches due to read and write miss cycles. As each doubleword is returned by the system, it is passed on to the CPU 60 and latched inside one of the memory update registers 116A-116D. At the completion of the system fetch, all four doublewords are written into the RAM array section 100 by assertion of the quad write (QWR) signal.
- the memory update registers 116A-116D contain quad fetch miss data from main memory.
- the order of loading the memory update registers 116A-116D is controlled through signal SA<3:2>, making burst-order to main memory purely transparent to the burst RAM. Once the line is loaded into the memory update registers 116A-116D, the entire line is loaded into the cache RAM array section 100 in one clock by activating the QWR signal.
- associated with each byte of the memory update registers 116A-116D are both a valid bit and a mask bit.
- the functionality of the mask and valid bits is shown below.
- when signal QWR is asserted (high) on a clock edge, the contents of the memory update registers 116A-116D are updated into the RAM array section 100, as pointed to by the address data stored in miss address register 110 and the previously latched bank select information (explained below).
- Each byte of memory update registers 116A-116D is written to the RAM array section 100 if its valid bit is set (indicating valid data from the system port 112), and its MASK bit is cleared (indicating no advance write occurred for that byte).
- !MASK & VALID: WRITE. MASK & VALID: NO WRITE. !MASK & !VALID: NO WRITE. MASK & !VALID: NO WRITE.
- Figure 15 shows the internal organization of the memory update registers 116A-116D.
- Mask bits are set during advance writes by assertion of MALE, SELECT#, and either HPWEA# or HPWEB#.
- the mask bit set within memory update registers 116A-116D is selected by SA<3:2>. Setting of the mask bits will be further described below.
- the valid bits are set by assertion of SELECT# and either SRDYI# or SBRDYI#. As with the mask bits, the valid bit set within memory update registers 116A-116D is selected by SA<3:2>. A further discussion of the valid bits is also given below.
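The mask/valid rule stated above reduces to one line of logic per byte (illustrative sketch, function name ours):

```python
def qwr_byte_written(mask_bit, valid_bit):
    """A byte of memory update register set 116 is written into the RAM
    array on QWR only if its valid bit is set (data arrived on the system
    port) and its mask bit is clear (no advance write occurred for that
    byte)."""
    return valid_bit and not mask_bit
```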
- the bank selection of RAM array section 100 is next considered.
- Host port 113 reads from and writes to bank 106 are selected by HPOEA# and HPWEA#, respectively, and host port 113 reads from and writes to bank 108 are selected by HPOEB# and HPWEB#, respectively.
- HPWEA# signifies a write cycle from the host port 113 to bank 106 through write register 120
- HPWEB# signifies a write cycle from the host port 113 to bank 108 through write register 120.
- HPOEA# and HPOEB# are active low output enables.
- A low assertion of HPOEA# will gate the read data from bank 106 to host port 113, and a low assertion of HPOEB# will gate the read data from bank 108 to the host port 113.
- HPWEA#, HPWEB#, HPOEA# and HPOEB# cannot be active simultaneously.
- the burst RAM cache memory 72 uses previously-latched inputs for bank selection during system port operations. Detection of a CPU miss, indicated by assertion of signal MALE (Miss Address Latch Enable), causes latching of bank select information. After signal MALE is asserted, the bank select information is latched within the burst RAM. Subsequent system port read (RAM array section 100 to write back register set 118) and write (memory update register set 116A-116D to RAM array section 100) operations use this latched bank select information.
- control logic and transceivers section 104 further comprises four multiplexers 111A-111D for routing data, a plurality of control signal decoders 119, 122, 124, 128, 129, 150, 158 and 166, and a plurality of data path drivers designated with solid triangular symbols.
- Each of the data path drivers is singly identified and further described below.
- Cache memory 72 allows single read operations with the local bus processor through the host port. A single read operation is shown in Figure 16. The falling edge of ADS# activates the hit address register 109 into flow-through mode. Activation of SELECT# enables the internal bank select logic of the cache memory 72.
- the falling edge of HPOEA# or HPOEB# indicates to the cache memory 72 which bank will supply the data.
- the RAM array section 100 is accessed and read to the local processor through the host port 113. After access times of valid address to valid data delay, data valid delay from HPOEx#, and data valid from HA<3:2> have passed, valid data is available on the host port 113 of the cache memory 72.
- Each cache memory chip 72A-72D contains the internal 32-bit read hold register set 114A-114D to facilitate high-speed burst read hit operations.
- Figure 17 shows a four-transfer burst operation.
- the first transfer of a burst read is accessed similarly to the scalar read previously described.
- ADS# assertion will activate hit address register 109 into a 'flow-through' mode.
- RHSTB should be deasserted (low) to allow data from the cache RAM array section 100 to bypass the read hold register set 114 and be sent to the host port data pins.
- for the second and subsequent transfers, signal RHSTB can be asserted. This has two effects. First, the entry in the RAM array section 100 pointed to by the hit address register 109 and bank select inputs will be latched into read hold register set 114 as the rising edge of RHSTB occurs. Second, RHSTB being held high (while BYPASS and WBSTB are low) will connect the output of the read hold register set 114 to the host port 113 data pins.
- Burst order is controlled by the HA<3:2> inputs, with the burst order transparent to the MS443's. As HA<3:2> toggles the next burst-order address, valid data from the read hold register set 114 will be available after ty has passed. RHSTB should be held high and WBSTB low to keep the read hold register set 114 connected to the host port.
- RAM array section 100 is available for system port operations.
- host port is available for use while miss processing is occurring on the system side.
- the cache memory 72 supports single host port write operations.
- the falling edge of ADS# sets hit address register 109 into "flow-through” mode.
- a falling edge of either HPWEA# or HPWEB# will cause host port data to begin flowing through the write register 120, as well as select which of the two banks of the RAM array section 100 is to be updated.
- HPWEA# and HPWEB# can trigger the write operation.
- Figure 18 shows a host port write operation to the cache memory 72. Activation of signal MALE will inhibit write register 120 data from being written into the data array.
- the write enable signals are sampled on every clock edge.
- a write into the RAM array section 100 will occur only on the rising edge of a clock, if MALE is inactive and either HPWEA# or HPWEB# is asserted. As will be shown later, buffered writes can be performed by asserting HPWEA# or HPWEB# to latch data into write register 120, and then inhibiting the array write by asserting signal MALE.
- when MALE inhibits the array write, the mask bit for the corresponding byte of memory update register set 116 will be set. This allows 'advance' writes to occur. Advance write operation is explained in more detail below.
- Cycles which require the burst-RAMs 72A-72D to supply data on the system port 112 may be accomplished as follows.
- Burst reads from the RAM array section 100 to the system port 112 may be accomplished.
- One case where burst reads from the RAM array section 100 to the system port 112 are necessary is flushing the cache. These reads would begin as detailed above in the single read case.
- Once write back register set 118 has been loaded with four bytes of data, these bytes may be burst onto the system bus byte-by-byte, with SA<3:2> toggling to select among the four bytes. SPOE# should remain asserted, and BYPASS deasserted, in order to enable the contents of write back register set 118 onto the system port 112.
- Snoop writes are an example of system port write cycles.
- Memory update register set 116 is loaded by the assertion of either SRDYI# or SBRDYI# on a rising clock edge. The byte of data appearing on the system port 112 is latched into memory update register set 116, as selected by SA<3:2>.
- Activation of the DW# signal indicates that the cache memory RAM array section 100 contains dirty data at the same address.
- activation (low) of signal DW# inhibits dirty and valid miss data from being latched into memory update register set 116.
- once miss address register 110 is loaded and the bank selected, the data within memory update register set 116 may be written into the RAM array section 100. This write is accomplished by assertion of the QWR signal on a clock edge. The bytes of memory update register set 116 which did not receive any writes from the system port 112 will not be updated into the RAM array section 100, as the valid bit for these bytes is cleared. The quad write operation clears all valid and mask bits associated with memory update register set 116.
- Figure 20 shows a single write operation to the RAM array 100 through the system port 112.
- Memory update register set 116 is a 32-bit register; up to four bytes may be loaded into memory update register set 116 before its contents are written into the RAM array section 100. Assertion of SRDYI# or SBRDYI# on the clock edge loads the system port 112 data into the memory update register set 116 byte pointed to by SA<3:2>. The four bytes of memory update register set 116 can be loaded in as quickly as four clocks.
- Dual-Port Operations
- the architecture of the cache memory 72 allows bypass operations. Bypass can occur in either direction as described below.
- Some host bus cycles may be designated as write-through cycles.
- the architecture of the cache memory 72 supports these cycles.
- Updates to the cache RAM array section 100 may occur during bypass operations as previously described in the section on Host Port Writes.
- the combination of either HPWEA# or HPWEB# asserted, SELECT# asserted, and MALE negated on a rising clock edge will generate a write into the cache RAM array section 100 during a host-to-system port bypass.
- Write operations to the cache memory 72 may occur as buffered writes. As described above, the falling edge of either HPWEA# or HPWEB# allows host port 113 data to begin flowing through the write register 120. Once write register 120 has been loaded with valid data, the buffered write may then be accomplished by asserting BYPASS high and SPOE# low. The contents of write register 120 will be driven onto the system port data pins while these two inputs remain asserted.
- buffered write operations may occur whether or not an update into the cache RAM array 100 occurs.
- Figure 21 shows a cache update due to a host port 113 write, with the write being buffered and continuing on the system port 112, until the system accepts the write data.
- Buffered write misses will be detailed in the write miss section; however, they proceed identically except for the MALE input. Unlike the BYPASS read case, in BYPASS writes, the state of the MALE input is recognized.
- Figure 22 details a buffered write operation, where no cache update occurs. MALE is asserted to inhibit the write operation.
- System to Host Port Bypass
- System-to-host port bypasses may also be generated.
- the requested data may be bypassed asynchronously from the system port 112 to the host port 113 in order to minimize the miss penalty and optimize performance.
- Use of the BYPASS path allows read miss processing to occur as quickly as possible, with no clock latencies between arrival of incoming data at the system port 112 and forwarding of the same data on to the host port 113.
- Designers should allow for the BYPASS propagation delay from the system port 112 to the host port 113, in addition to the normal CPU read data setup time.
- SPOE# When BYPASS is deasserted, SPOE# is used to enable system port data from write back register set 118 onto the system port 112.
- SRDYI# or SBRDYI# sampled asserted (low) on a rising clock edge will latch system port data into memory update register set 116. Since memory update register set 116 is a 36-bit register, SA<3:2> will select one of four bytes in memory update register set 116.
- system port data may be latched into memory update register set 116. This can be accomplished on rising clock edges by assertion (low) of either the SRDYI# or SBRDYI# inputs.
- the cache memory 72 does not differentiate between these inputs, and they are internally ANDed together. Assertion of either of these on a clock edge will latch system port data into one byte of memory update register set 116, as selected by SA<3:2>, provided data has been valid one setup time previous to the rising edge of the clock.
- Memory update register set 116 can act as a buffer register if the burst order between the 486 host and main memory is different. Memory subsystems using DRAM nibble mode are likely to use sequential burst order, unlike the i486 CPU. As each of the subsequent three doublewords of the burst are read from main memory, they are bypassed to the host port 113 if the burst orders are the same. A difference in burst order will result in data coming from memory update register set 116. Signal QWR can be asserted once all miss data from memory is updated into the memory update register set 116.
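To illustrate why reordering can be needed, the i486 wrapped burst order (each transfer address is the demand doubleword index XORed with 0, 1, 2, 3) can be compared against a sequential, nibble-mode style order. This is a sketch; the function names are invented for the illustration.

```python
def i486_burst_order(demand):
    """i486 wrapped burst order: each address is the demand dword index XOR 0,1,2,3."""
    return [demand ^ i for i in range(4)]

def sequential_burst_order(demand):
    """Nibble-mode DRAM style: wrap sequentially upward from the demand dword."""
    return [(demand + i) % 4 for i in range(4)]

def needs_reorder(demand):
    """True when data must be staged through register set 116 instead of bypassed."""
    return i486_burst_order(demand) != sequential_burst_order(demand)
```

When the demand word is the first doubleword of the line, the two orders coincide and every transfer can be bypassed; otherwise some transfers must be sourced from the buffer register.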
- Figure 24 details a read miss, where reordering occurs between the system and host ports.
- Some cache architectures may contain lines of data which are partially dirty.
- Figure 25 shows a line which is partially valid and partially dirty.
- the cache memory 72 must supply the data which is dirty, while the system must supply the portion of the line which is not present in the cache.
- the architecture of the cache memory 72 supports such situations.
- System-to-host port bypass cycles may be interrupted by negation of the Bypass signal.
- When Bypass is negated, the cache memory 72 will supply data as selected by either HPOEA# or HPOEB#, the host port address, and HA<3:2>.
- Figure 26 shows a CPU read miss, from the line detailed in Figure 25.
- the negation of Bypass causes the cache memory 72 to supply the dirty data from its RAM array 100.
- the cache memory 72 architecture supports "advance" host port writes in write-back cache architectures.
- Advanced write miss processing means that write miss data can be directly updated into the RAM array section 100 in cache memory 72. System fetches at the same address, in order to fill the remainder of the cache line, can occur subsequently and be written into the RAM array section 100 without overwriting the previously stored data.
- Mask bits can be used to support advance writes. As a write miss (SELECT# and either HPWEA# or HPWEB#) from the host port 113 (through write register 120) occurs, the mask for the corresponding byte (as selected by HA<3:2>) in memory update register set 116 is set. Any subsequent fetches from the system port 112 will not overwrite the data now written into the RAM array section 100.
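The mask-bit protection of advance writes can be sketched in software terms. This is purely illustrative; the function names and list representation are assumptions, not the hardware interface.

```python
# Illustrative model of 'advance' write protection: a host write miss updates
# the array directly and sets the mask bit for that byte; later system fetch
# data is dropped for masked bytes, so it cannot overwrite the newer CPU data.

def host_advance_write(array_line, mask, byte_sel, data):
    array_line[byte_sel] = data
    mask[byte_sel] = True            # mark this byte as already written by the host

def system_fetch_update(array_line, mask, byte_sel, data):
    if not mask[byte_sel]:           # masked bytes keep the advance-write data
        array_line[byte_sel] = data

# Example: the CPU advance-writes byte 2, then the system quad fetch arrives
line = [0x00] * 4
mask = [False] * 4
host_advance_write(line, mask, 2, 0x5A)
for i in range(4):
    system_fetch_update(line, mask, i, 0xFF)
```

The fetched line fills the three missing bytes while the advance-written byte survives.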
- register set 116 also corresponding to the least
- Figure 27 shows an advance write occurring.
- the architecture of the cache memory 72 allows for easy evacuation of dirty data into the write back register set 118. In addition, further performance is gained.
- Figure 28 details a CPU read miss cycle
- ADS# signals the beginning of the cycle.
- Signal MALE latches bank select information and the address into miss address register 110. Later assertion of WBSTB latches the data which is to be replaced into write back register set 118. Incoming data from the system port 112 is forwarded directly on to the host port 113 through assertion of signal Bypass. SPOE# is held high to correctly enable the Bypass direction, and
- WBSTB The rising edge of WBSTB will trigger the latching of data (selected by miss address register 110 and previously latched bank select information) into write back register set 118.
- a write-back burst sequence should occur if the data replaced is dirty.
- ADS# The falling edge of ADS# signals the beginning of the cycle.
- HPWEA# or HPWEB# is asserted in order to select the bank to be written. However, before the write can occur, the dirty data must first be evacuated from the RAM array 100. As such, MALE should be asserted along with HPWEA# or HPWEB# to inhibit the write operation from occurring.
- the rising edge of WBSTB will trigger the latching of the data from the selected replace line into write back register set 118.
- a write-back cycle should occur if the data replaced is dirty.
- memory update register set 116 will be used to hold quad fetch data from main memory. Miss data will be fetched in a "wrapped-around" fashion with the demand word fetched first. Each byte of the memory update register set 116 is associated with a valid bit. As each byte is updated into memory update register set 116 through SRDYI# assertion, the corresponding valid bit will be set if signal DW# is active. This valid bit will qualify the corresponding byte of the memory update register set 116 to be written into the cache memory 72 data array as QWR is asserted. After each QWR (quad write) cycle, all valid and mask bits associated with memory update register set 116 will be reset.
- Figure 29 details a write miss, with evacuation of dirty replaced data and the ensuing system quad fetch. At the end of the quad fetch, the contents of write back register set 118 are written to memory.
- Figure 30 shows the same write miss cycle as before; however, at the end of the quad fetch, signal MWB is asserted in order to toggle miss address register 110 to point to the second line belonging to the replaced tag entry. After MWB is asserted, WBSTB loads write back register set 118 with the contents of this line. Finally a write-back cycle to the system of this newly loaded data occurs.
- the dual port architecture of the cache memory 72 is one of its most powerful features.
- the cache memory 72 is capable of processing on the system and host ports
- buffered write-through cycles can occur through the write register 120.
- the host port 113 can process read and write hits.
- system read and write requests can occur in parallel with host port operations.
- write- backs of dirty data can occur from write back register set 118 and be hidden from the CPU. While these write-backs occur, local CPU cycles can be satisfied on the host port 113.
- Figure 31 shows many of the features of the cache memory 72 in use simultaneously.
- a CPU write miss occurs on the host port 113. This miss will be written into the RAM array 100 through an 'advance' write.
- the dirty data in the RAM array section 100 which is to be evacuated is loaded into write back register set 118.
- a system quad fetch occurs to fill the remainder of the line that the CPU write updated.
- This system fetch is transparent to the CPU, and will be stored in the RAM array section 100 without overwriting the 'advance' write. While this fetch completes, the CPU generates another write cycle and a burst read cycle, which are both satisfied on the host port 113.
- HADDREG address latched into MADDREG. Latches bank select information for future SP
- controller 70 are next considered. The following sections further include the sequencing states of controller 70.
- Cache controller 70 is shown with a bus controller 200, a bus controller 202, an integrated tag array 204, a concurrent bus control unit 206 and data path control unit 208.
- Bus controller 200 interfaces with CPU 60 through corresponding address and control lines
- bus controller 202 interfaces with memory 61 through corresponding address and control lines.
- data path control unit 208 generates control signals that are received by cache memory 72.
- controller unit 70 is shown with specific input and output terminals adapted for a 486 microprocessor-based system. Controller 70 is shown with local processor interface unit 220, processor cache invalidate control 222, control register interface unit 224, burst RAM interface unit 226, system interface control unit 228, system address bus control 230, system data bus control 232, cache coherency control 234 and system bus arbitration unit 236. Each of units 220 through 226 provides and receives signals from the other system devices.
- Register Set and Programming Model
- the default configuration of the controller 70 is in write-back mode, configured for operation in a PC-AT environment. Any programming that is necessary will generally be carried out by the BIOS or operating system as part of the initialization process.
- controller 70 is initialized, few instances will arise where additional programming is necessary.
- the controller 70 contains the following registers.
- a control register determines the operating modes of the controller.
- An expansion register (XREG) offers cascade expansion mode to the controller 70 architecture.
- Eight address registers allow the controller 70 to offer four protected address regions.
- a protection register (PREG) defines the operating modes of these four regions. Availability of these regions eliminates the need for high-speed address decode PALs in a system.
- registers will be visible to application programmers, for system customization as desired.
- Registers in the controller 70 are accessed by a two-step register indexing method.
- the index address of the register to be read or written is written to the low I/O location assigned to the controller 70 (i.e., MCCSEL# asserted and A2 is low).
- the contents of that register can be read or written by performing the corresponding read or write I/O cycle to the high I/O location (i.e., MCCSEL# asserted and A2 is high).
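A software model of this two-step register indexing might look like the following. The class name and the register indices used are illustrative assumptions; only the index-then-data access pattern follows the text.

```python
# Illustrative model of the two-step register indexing scheme: a write to the
# low I/O location (A2=0) latches the register index; a read or write to the
# high I/O location (A2=1) accesses the register selected by that index.

class IndexedRegisterFile:
    def __init__(self):
        self.regs = {0x00: 0x00, 0x01: 0x00}   # hypothetical indices, e.g. CREG, XREG
        self.index = 0x00

    def io_write(self, a2, data):
        if a2 == 0:
            self.index = data                  # step 1: select the register
        else:
            self.regs[self.index] = data       # step 2: write the selected register

    def io_read(self, a2):
        return self.index if a2 == 0 else self.regs[self.index]
```

A register write is thus two I/O cycles: one to the low location with the index, one to the high location with the data.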
- CREG<0> Reserved This bit should always be set to a value of 1. A zero in this field is not supported. CREG<1> ARDY Setting this bit will make the controller 70 RDY# input asynchronous. In asynchronous mode, RDY# input will be forwarded to RDYO# in the next clock. This allows the controller to accept a RDY# input that cannot meet the synchronous setup time. When this bit is cleared,
- RDY# input will be forwarded to RDYO# in the same clock.
- CREG<3> Reserved This bit should always be set to a value of 1. A zero in this field is not supported.
- Provisions in XREG allow an expansion configuration of up to 256K byte of cache memory using four controllers.
- the controller 70 will write back all dirty data
- controller 70 will clear this bit.
- NCA Non-Cacheable Address
- a cache flush should precede the disabling, in order to flush any dirty data in the cache.
- the protection register defines the operation of the four available protected regions. Each protected region is associated with an NCA bit and a CWP bit. Note that either the NCA bit or the CWP bit may be set for a protected region, but not both.
- PREG<7,5,3,1> NCA<4:1> The setting of each of these four bits defines one of the four protected regions of the cache controller to be non-cacheable. NCA<1> high will disable caching in the address range defined by PR1S and PR1E. Likewise, NCA<2>, NCA<3> and NCA<4> will disable caching in regions two, three and four. PREG<6,4,2,0> CWP<4:1> Setting each of these bits defines the corresponding region as cacheable, but write-protected. Writes to an address region with the CWP bit set will be bypassed to the system by the controller 70. By use of this bit, system and video BIOS may be safely cached.
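A small decoding sketch may make the per-region bit pairing clearer. It assumes the NCA bits occupy PREG<7,5,3,1> and the CWP bits PREG<6,4,2,0> for regions four down to one, as the text indicates; the function name is invented.

```python
# Sketch of decoding the protection register (PREG). For region r (1..4),
# the NCA bit is assumed at bit position 2r-1 and the CWP bit at 2r-2.

def decode_preg(preg):
    modes = {}
    for region in range(1, 5):
        nca = (preg >> (2 * region - 1)) & 1
        cwp = (preg >> (2 * region - 2)) & 1
        # NCA and CWP may not both be set for the same region
        modes[region] = ('undefined' if nca and cwp else
                         'non-cacheable' if nca else
                         'write-protected' if cwp else 'cacheable')
    return modes
```

For example, setting only bit 1 marks region one non-cacheable, while setting only bit 6 marks region four as cacheable but write-protected.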
- the controller 70 can protect up to four address regions. Each protected region is defined through two 16-bit registers and two operating mode bits, NCA and CWP. The starting (low) address register and the ending (high) address register identify the address range of the protected regions. The starting and ending addresses of all four protected regions may be defined to 4K byte boundaries. For
- example, the PR1S and PR1E registers define the start and end of the first protected region.
- the second through fourth regions are similarly defined. These addresses are exclusive, and therefore not intuitive. For example, to use region 4 as a protected region from address 40000 hex to 7FFFF hex, the start address (PR4S) should be loaded as 003F hex and the end address (PR4E) should be loaded as 0080 hex.
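The exclusive start/end encoding can be computed mechanically. The following sketch (function name invented) reproduces the worked example in the text, where region 40000h through 7FFFFh loads as 003Fh and 0080h.

```python
# Sketch of computing the exclusive start/end register values for a protected
# region on 4K-byte boundaries. The start register holds the 4K page just
# below the region; the end register holds the 4K page just above it.

def region_registers(start_addr, end_addr):
    prs = (start_addr >> 12) - 1     # exclusive: last 4K page below the region
    pre = (end_addr + 1) >> 12       # exclusive: first 4K page above the region
    return prs, pre
```

For the text's example, `region_registers(0x40000, 0x7FFFF)` yields the PR4S/PR4E values 003Fh and 0080h.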
- NCA=1, CWP=0: non-cacheable address region. NCA=1, CWP=1: undefined state; do not use. An address not contained in any of the four protected regions is assumed to be a cacheable address. Settings such that the value of the starting address is larger than that of the ending address should be avoided.
- NCA Non-Cacheable Address
- the cache controller 70 is essentially ready for use in a PC environment.
- the default register values prepare the controller for use in write-back mode using 486 CPU burst order to memory and assuming use of cache memory 72.
- all that is necessary for the BIOS or operating system is to set bit 0 of the control register.
- Before loading any of the non-cacheable region support registers, the cache should be temporarily disabled. This can be done by clearing the CE bit in the control register.
- the cache should then be invalidated by setting the INV bit in the control register. This avoids any data coherency problem as a result of the non-cacheable region change.
- the cache can safely be re-enabled.
- controller 70 responds to the different types of bus cycles generated by the 486 CPU.
- the discussion covers normal cacheable memory reference (read/write) operations, locked, interrupt acknowledge and halt/shutdown cycles. Special 486 CPU cycles like Flush and Write-back cycles for supporting the 486 CPU INVD and WBINVD instructions are also described.
- the interface with cache memory 72 burst SRAMs is also discussed.
- the controller 70 supports 486 CPU systems with the cache memory 72 Burst-RAMs.
- the controller 70 only supports write-back mode for 486 systems. Write-through is supported on a cycle to cycle basis through the PWT input.
- the controller 70/cache memory 72 support burst reads for 486 cache line fills.
- the controller 70 will follow 486 style address sequence on the 486 CPU local bus. On the system bus, either 486-style or sequential address sequence is supported.
- Use of cache memory 72 allows miss operations and write-back cycles to be carried out in parallel with hits.
- M/IO#, D/C# and W/R# are the primary bus cycle definition signals from the 486 CPU. These signals are driven valid in T1, as ADS# is asserted.
- M/IO# distinguishes between memory and I/O cycles.
- D/C# distinguishes between data and code cycles.
- W/R# distinguishes between write and read cycles.
- Three other 486 signals provide cycle definition to the controller 70. These signals are:
- PCD Page Cache Disable
- the PWT pin is used by 486 CPU software on a per-page basis.
- the LOCK# pin, when asserted, indicates that the 486 CPU is performing a read-modify-write operation over several bus cycles. The 486 CPU should retain ownership of the bus while LOCK# is asserted.
- Table IV shows the encodings for the various bus cycles that occur in 486 CPU systems. Halt cycles have been moved to location 001 from location 101, their location on the 386 CPU. Location 101 is now reserved in the 80486. Table IV. 80486 Bus Cycle Definitions
- the controller 70 presents a very 486 CPU-like interface to system logic.
- the great majority of system interface pins have the same name and functionality as their 486 CPU counterparts. Because of these features, designing the controller 70 into a system is very straightforward. Glue logic to design in the chipset is minimal.
- hooks can be used to drive optional address transceivers and latches on the system side. These devices may be used if additional drive capability is desired; however, the controller 70 system side specifications assume 100 pF of loading.
- the SA OE# and SA DIR control signals are No Connect pins if address transceivers are not used.
- the controller 70 supports all i486 CPU functionality on the host (local CPU) side. A high-speed 32-bit
- the controller 70 has a pin list identical to that of the i486 CPU, except for the following pins:
- the 486 CPU architecture intends the PCD and PWT signals to be used by an external cache, and the system bus has no need of these signals.
- AHOLD is not needed in current system design.
- the preferred method of invalidating i486 CPU cache lines is through the SHOLD/SHLDA protocol, and will be discussed later.
- BS8# and BS16# are not supported; systems must interface to the controller 70 with a 32-bit interface.
- PLOCK# is rarely used in systems. If it is desired to include PLOCK# in a system, a fast AND gate may be used to connect the LOCK# and PLOCK# outputs of the i486 CPU. The resulting AND will then be used as the LOCK# input of the controller 70.
- KEN# is not necessary on the system side, as the controller 70 contains register support for four protected address regions, all of which may be either entirely non-cacheable or read cacheable write-protected. Addresses will be decoded by these registers and KEN# returned to the i486 CPU in T1, in order to avoid any performance degradation. Use of the controller 70 in a system
- BOFF# or Back Off
- This functionality will be included in the next generation of the controller 70 family of controllers.
- controller 70 has a bus snooping feature and the ability to intervene on system snoop reads, when the requested data is both present in the cache and also 'dirty'. As a result, the controller 70 has three additional pins to support this functionality:
- controller 70 re-drives cycle on system
- SBLAST# is driven low, regardless of state of BLAST#
- controller 70 returns RDYO#
- controller 70 returns RDYO#
- controller 70 returns
- KEN# asserted to cause i486 CPU cache line fill
- controller 70 returns
- controller 70 asserts SBLAST#
- Controller 70 Response to 486 CPU Cycles - As shown in Table VI, the controller 70 distinguishes between four main classes of cycles. These four classes are system cycles, local bus cycles, write-through cycles, and normal (cacheable) cycles.
- the controller 70 detects a bus cycle as a system cycle through either one of the
- An NCA cycle is a read/write
- a cycle which is an I/O read or write, an
- Controller 70 disabled. If the CE bit in the control register is cleared (0), all cycles will become system cycles.
- the controller 70 forwards the address and bus cycle control signals to the system bus without performing a cache access. All read cycles are treated as read misses except that the cache directory and the cache data array are not affected. All writes are treated as write misses. NCA write cycles will be buffered.
- KEN# is de-asserted to the 486 CPU. This will prevent the returned data from being cached in the CPU.
- SBLAST# will be low in ST1, regardless of the state of BLAST# from the 486 CPU. For most system cycles, this will have no impact, as the 486 CPU BLAST# output will be low for I/O, INTA, Halt/Shutdown, and normal write cycles. However, the effect of driving SBLAST# low means that non-cacheable address reads cannot be burst from system memory; NCA reads are non-burstable.
- Figure 34 shows a system read cycle, where the cycle definition is passed on to the system.
- the controller 70 asserts SADS# the clock after ADS# was asserted.
- Bypass signal is asserted to allow returned system data to be passed back to the 486 CPU in the same clock.
- HPOEx# is enabled to allow the read data to be passed back to the 486 CPU.
- Figure 35 shows an I/O write cycle, which is passed on to the system without being buffered.
- SPOE# is asserted low to enable the system port.
- MALE is asserted high by the controller 70 in ST2 to inhibit the cache memory 72 write operation.
- I/O Cycles - I/O cycles are passed on to the system bus, and terminated when the system asserts SRDYI# or SBRDYI#. Either of these signals is passed to the CPU as RDYO#. I/O cycles are not buffered. I/O cycles will not produce any cache operations. INTA (Interrupt Acknowledge) Cycles - The 486 CPU generates interrupt acknowledge cycles in locked pairs. The controller 70 will re-drive these to the system, with the same encoding as on the 486 CPU. Also like the 486 CPU, the state of A2 will allow system logic to distinguish between the two cycles.
- A2 will be driven high during the first INTA cycle, and low for the second.
- SLOCK# will be asserted between and during both of these cycles.
- the controller 70 will invoke the Bypass signal and HPOEx# to the cache memory 72 during the second INTA cycle so that interrupt vectors are passed from the system bus to the local processor bus.
- Halt/Shutdown Cycles - The controller 70 treats halt/shutdown cycles as system cycles. During
- the controller 70 duplicates the encoding of the host processor on the system memory bus. External hardware should acknowledge halt/shutdown cycles through SRDYI# or SBRDYI# assertion. During halt/shutdown cycles, the controller 70 will recognize SHOLD from the system memory bus and will respond by floating the system address and control definition signals, and asserting SHLDA.
- NCA Non-Cacheable Address Cycles - The address of each bus cycle is compared against the contents of the protected address region registers to determine cacheability of the cycle.
- the cycle is determined to be an NCA cycle.
- the cycle will be forwarded to the system bus without any cache operation taking place. No data will be supplied from the cache in this case.
- KEN# is driven high (deasserted) to the CPU to prevent data on read cycles returned from the system from being cached. KEN# being returned high in T1 avoids any performance degradation.
- Bypass allows the buffered write data contained in write register 120 to be driven to the system.
- Figure 36 details an NCA write cycle.
- the write cycle is buffered and terminated on the local CPU bus in two clocks by the controller 70. Note that like a normal cache write hit, HPWEx# is asserted low in ST2. However, the write to the cache data array is inhibited by assertion of MALE. The write data is latched into write register 120. Write register 120 will act as the write buffer in this cycle. The cycle continues on the system side until terminated by SRDYI# or SBRDYI#. The Bypass signal stays activated until SRDYI# or SBRDYI# is returned.
- Local Bus Cycles - The second class of cycles handled by the controller 70 are Local Bus cycles. Local bus cycles are not passed on to the system, nor do they cause a cache hit/miss determination. Local bus cycles consist of reads and writes to the controller 70 control registers, Weitek bus cycles, and writes to cacheable write-protected regions.
- Controller 70 Register Reads/Writes - Control words are read and written to the registers of the controller 70 through a two-step index addressing process. First, a write cycle is performed to the address of the controller 70. The data for this write cycle is the index address of the desired register. A2 should be low for this cycle. This indicates to the controller 70 which register is to be read or written to.
- a second cycle is then performed to the controller 70.
- This cycle performs the actual read or write to the control register.
- the data lines for this cycle contain or return appropriate read/write data.
- A2 should be high for this cycle.
- Figures 37 and 38 show reading and writing to one of the control registers of the controller 70.
- MCCSEL# should be decoded and asserted to the controller 70 in the T1 state.
- the controller 70 will return RDYO# to terminate these cycles. Both reads and writes will take one wait state.
- Weitek Bus Cycles The host CPU may execute bus cycles that access the Weitek math coprocessor (4167) in some 486 CPU systems. The controller 70 simply ignores these accesses and does not initiate any activity in the cache or on the system bus.
- the controller 70 recognizes bus cycles intended for the 4167 through the pattern of A<31:25> being <1100000>.
- the 4167 acts as a device in a reserved memory space.
- the 4167 generates its RDY# signal to the controller 70, which passes it on to the 486 CPU as RDYO#.
- the local RDY# input from the Weitek coprocessor may be operated in an asynchronous mode. In this mode, RDY# is latched and sent to the 486 CPU in the following clock, instead of in the same clock. This mode should be used if the synchronous mode RDY# setup time cannot be otherwise met. Bit 1 in the controller 70 control register controls this mode.
- the controller 70 defines a third class of cycles to be the write-through cycles.
- the controller 70 detects a write-through cycle through the following condition:
- the 486 CPU asserts either the PCD, PWT, or
- the controller 70 When a write-through cycle occurs, the controller 70 will always generate a write to the system bus, regardless if the cycle is determined to be a cache hit or miss.
- Write-through miss cycles will also be buffered writes, although RDYO# will be asserted after three clocks, instead of two as for write hits. Unlike normal write misses, write through miss cycles will neither update the cache nor generate system quad fetches.
- Normal (Cacheable) Cycles - The fourth class of cycles are the normal cacheable cycles. These cycles will be the great majority of cycles which occur. Normal cacheable cycles are the default cycles, and are assumed if a cycle does not fit into any of the previously
- controller 70 detects a normal cacheable cycle under either one of the following conditions:
- a write cycle which is not an I/O, Interrupt Acknowledge, Halt/Shutdown, PCD, PWT, or Locked cycle.
- writes whose addresses are contained in the protected address region registers without either the CWP or NCA bits set are cacheable cycles.
- a read hit will result in either a single, double, or quad transfer to the 486 CPU.
- the number of transfers depends on the state of the internal 486 CPU cache, as reflected by the PCD pin. The meanings of the two states of PCD are described below:
- a quad doubleword transfer will result, in order to fill a line of the 486 CPU internal cache.
- KEN# is returned active (low) twice to the 486 CPU, first in T1 to initiate the line fill, and also the clock before the final transfer.
- the quad transfer will take five clocks with cache memory 72 burst-RAMs.
- the demand doubleword will be returned first followed by the other three remaining doublewords in the same line, following the 486 address order.
- BLAST# will be asserted by the 486 CPU during the fourth transfer.
- a quad burst transfer from 82C443 Burst-RAMs is shown in Figure 39.
- PCD 1: PCD being asserted high by the 486 CPU indicates no line fills will occur in the 486 CPU cache. KEN# will be returned high to the CPU in the T1 state. Either a single, double, or quad transfer will occur in this case, depending on the BLAST# output of the CPU.
- the controller 70 will monitor
- BLAST# from the 486 as each transfer is completed.
- BLAST# assertion (low) indicates that the cycle associated with the corresponding BRDYO# is the last data cycle.
- BLAST# assertion will terminate the transfers.
- a single transfer is shown in Figure 40. The first demand word will require two bus states. For each of the remaining doublewords, only one bus state is required. BRDYO# will be asserted for each returned doubleword. Single doubleword transfers will require two clocks and double transfers three clocks.
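The clock counts quoted above follow one simple rule: the demand doubleword takes two bus states and each remaining doubleword takes one. A tiny sketch (function name invented) ties the figures together.

```python
def transfer_clocks(n_transfers):
    """First (demand) doubleword takes two bus states; each remaining one takes one.

    This reproduces the counts stated in the text: single transfers take two
    clocks, double transfers three, and quad transfers five.
    """
    return 2 + (n_transfers - 1)
```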
- a read cycle to a CWP region will result in the controller 70 returning the KEN# output deasserted (high) to the i486 CPU. This will prevent the corresponding data from being cached inside the i486 CPU internal cache.
- the controller 70 will assert CCSx#.
- the bank being read is enabled by assertion of HPOEx#.
- the controller 70 will return BRDYO# to terminate each transfer to the 486 CPU, until BLAST# signals the end of the cycle.
- the first access will come from the cache data array.
- read hold register set 114 will be loaded with the data to satisfy the entire quad read. Data for the remaining second through fourth transfers will be read from the read hold register set 114.
- signal RHSTB will be asserted high in the second ST2 state.
- the controller 70 will drive HA<3:2> to valid levels to provide 486-style address sequencing. HA3 and HA2 select a doubleword from the four in each line.
- HA<3:2> should be used, since their valid delays are much shorter than those of A<3:2> from the 486 CPU.
- Cacheable Read Miss Operations - The tag comparison for a cycle may indicate the occurrence of a read miss. There are two kinds of misses, a tag miss and a line miss. A tag miss results when the tag lookup does not produce a match, or the tag valid bit is not set. A line miss occurs if the tag lookup produces a match and the tag valid bit is set, but any of the four doublewords being accessed is not valid.
- Tag misses and line misses are treated differently by the controller 70.
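The tag miss / line miss distinction might be modeled as follows. The field names and dictionary representation are illustrative assumptions, not the hardware's tag array format.

```python
# Sketch of the hit/miss classification: a tag miss when the tag lookup fails
# or the tag valid bit is clear; a line miss when the tag matches but the
# requested doubleword's valid bit is clear; otherwise a hit.

def classify_access(entry, addr_tag, dword_index):
    if entry['tag'] != addr_tag or not entry['tag_valid']:
        return 'tag miss'
    if not entry['dword_valid'][dword_index]:
        return 'line miss'
    return 'hit'
```

The two miss kinds matter because, as described below, they are treated differently by the controller.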
- Cacheable read misses generate a system quad fetch.
- data that is retrieved from the system may replace older data existing in the cache. If any of the data which is replaced is both valid and dirty, one or two write-back cycles will occur.
- the controller 70 examines the LRU bits of the target entries to select which of the two banks is to be
- the controller checks if any one of the eight doublewords of the two lines which correspond to the selected tag entry is marked both "valid" and "dirty".
- the controller 70 will latch the read miss address into the cache memory 72's miss address register 110.
- the miss address together with the LRU bits select the data line to be replaced. This data is then latched into the write back register set 118 of the cache memory 72.
- the 486 PCD pin affects read miss operations similarly to the read hit case. PCD being low indicates a cache line fill for the 486 and cache memory 72 will occur.
- the controller will assert KEN# twice to the 486 CPU to perform a line fill, before the first and fourth transfers are completed. If PCD is high, no line fill will occur in the 486 CPU cache, although a line fill will occur in the cache memory 72. KEN# will be returned high to the 486 CPU in this case.
- Cacheable read misses initiate quad fetches on the system side, in order to bring four doublewords into the cache data RAM and fill a cache line.
- This quad fetch will continue, even if the 486 CPU does not require all four doublewords (i.e., the 486 CPU internal cache is turned off).
- the CPU may terminate the fetch on the host side after one or two transfers, while the quad fetch from the system continues until completion.
- the quad fetch is finished in order to increase the cache hit rate and, as a result, overall performance.
- the controller 70 will assert SBLAST# on the fourth transfer, to indicate the completion of the cycle.
- Figure 41 shows a read line miss with PCD asserted by the 486 CPU.
- a system quad fetch results and completes, although the 486 CPU terminates the cycle on its local bus after only two transfers. Note that another read or write hit could then be processed on the local bus while the system quad fetch completes.
- the write-back architecture of the controller 70/cache memory 72 isolates local CPU and system bus processing.
- signal MALE is asserted high in T2 to indicate that a miss has occurred.
- MALE latches the read miss information.
- Bypass is asserted high to allow data from the system quad fetch to pass on to the 486 CPU in the same clock.
- the addition of the Bypass propagation delay means that valid data setup time for the cache memory 72 must be greater than that of the 486 CPU.
- HPOEx# goes low to enable the host port outputs to bypass data received from the system.
- system logic can assert either SRDYI# four times or SBRDYI# four times to complete the transfer.
- controller 70 will pass either ready input from the system to the 486 CPU as BRDYO#.
- BLAST# is monitored as each doubleword is transferred. BLAST# assertion will indicate the final transfer of the host-side cycle.
- the system quad fetch from main memory will not be written directly to the cache data array. Instead, fetched data will be loaded into memory update register set 116. Since the line may have been partially valid and contained some dirty doublewords which should not be overwritten, incoming doublewords from system memory will be qualified before they can be written into memory update register set 116.
- the controller 70 will assert the DW# signal to indicate to the cache memory 72 that the fetched doubleword may be safely written into memory update register set 116. Inactivation of DW# inhibits dirty and valid miss data from being overwritten by system quad fetch data.
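The DW# qualification can be sketched as a per-doubleword merge, under the assumption (from the description above) that DW# is deasserted exactly when the corresponding cached doubleword is both valid and dirty. Names are illustrative:

```python
def merge_quad_fetch(line, fetched, valid, dirty):
    """Merge system quad fetch data toward the memory update register
    set. A fetched doubleword is accepted (DW# asserted) only when the
    corresponding cached doubleword is not both valid and dirty, so
    dirty data is never overwritten by fetch data."""
    merged = list(line)
    for i in range(4):
        dw_asserted = not (valid[i] and dirty[i])
        if dw_asserted:
            merged[i] = fetched[i]
    return merged
```

For example, if only doubleword 1 of the line is valid and dirty, the fetch fills the other three slots and leaves doubleword 1 untouched.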
- the controller 70 will send the appropriate control signals to turn the data transceiver to receive mode.
- the address sequence to main memory will either follow 486 style address sequence or sequential address sequence. If the burst order at main memory interface is the same as the 486 (no re-ordering is occurring), the Bypass signal will be activated and the incoming data from main memory will be sent to the local bus through the bypass path inside the cache memory 72. At the same time, the data will be latched in a cache memory 72 register, to be updated into the cache memory 72 once the entire line is received.
- Figure 42 details a read line miss, with the controller 70 responding by activating a quad fetch on the system memory bus. Assertion of SADS# and the other control signals initiates the quad fetch. Bypass is activated, so that the data being returned from the system is forwarded on to the 486 CPU in the same clock. Because the read data must propagate through the cache memory 72 Bypass path and still meet 486 CPU setup time, the read data setup time to the cache memory 72 is longer than that of a 486 CPU by itself.
- the SA3 and SA2 (system address 3 and 2) signals become valid early in each T2 state, much sooner than the 486 CPU would provide them.
- the read data is latched into the cache memory 72 memory update register set 116 register.
- controller 70 activates SBLAST# to terminate the cycle.
- the data in memory update register set 116 is written into the cache memory 72 RAM array section 100 by activation of the QWR (Quad WRite) signal.
- Figure 42 also details a complexity that will sometimes occur. On read line misses, the line may already contain valid dirty doublewords, which must be preserved and supplied to the CPU in place of the corresponding fetched data.
- Figure 43 details a read line miss with reordering.
- a difference in address order between the main memory interface and the 486 will result in the first demand word being sent to the local bus through the cache memory 72 bypass path.
- the second and third doublewords within the line will be latched into cache memory 72 memory update register set 116 as well as being sent to the 486 CPU.
- the fourth doubleword is then returned from main memory, passed to the CPU by way of the cache memory 72 bypass path and latched into memory update register set 116 simultaneously.
- the controller 70 will generate BRDYO# for these two remaining returned data cycles. Note that as in the previous figure, part of the line may be valid and dirty. The controller 70 will deassert Bypass and supply the 486 CPU with the correct dirty data during the appropriate transfers.
- the write-back architecture of the controller 70 requires additional considerations when this old cache data is replaced.
- a direct replacement algorithm would cause data to be lost if the replaced data is valid and dirty.
- Data is marked dirty if it has been modified by the 486 CPU but not yet been copied back to main memory.
- read tag misses and the subsequent loading of a new line of data into the cache from the resulting system quad fetch may be followed by one or two write-back cycles to main memory at the end of the quad fetch.
- the controller 70 allows these quad writes to be burst to main memory, if the system memory controller is capable of receiving such bursts.
- the controller 70 will generate either zero, one or two quad write replacement cycles to main memory, which may be bursted by the system. No write replacement cycle will be generated if the selected tag entry is invalid or both lines associated with the selected tag entry contain no dirty data.
- Address order on quad-writes may follow either 486 burst order or sequential burst order. Unlike write-back cycles which occur due to flush operations, the CALE signal will not be generated for write-backs due to misses, as the needed address is already contained in the cache memory 72 miss address register 110.
- Quad write transfers will contain a mixture of dirty and non-dirty doublewords.
- the controller 70 provides a signal to avoid performing writes which would contain non-dirty data (which the system already contains).
- the DW# signal is driven valid in all T2 states of write cycles. To simplify system design, DW# will become valid early in each state.
- DW# should be incorporated into two parts of system memory logic. First, DW# should be part of the write enable logic. DW# being low for a transfer indicates that the corresponding data is indeed dirty and should be written into the system memory array. Writes in which DW# is driven low are processed normally by the system.
- the system may, at its option, accept this write data.
- DW# can be incorporated into the SRDYI#/SBRDYI# logic in order to quickly terminate non-dirty writes.
- the state of DW# being high can be used in combinational logic to terminate non-burst dummy write transfers in two clocks and burst dummy transfers in one clock. Because DW# becomes valid early in each T2 bus state, it can easily be incorporated into system logic without being the most critical path.
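The two suggested uses of DW# in system logic can be sketched together, assuming real writes take whatever time the memory array needs while DW#-high "dummy" writes are ready-terminated at the bus minimum (two clocks non-burst, one clock burst). Names are illustrative:

```python
def process_write_transfer(dw_high, burst, array_clocks):
    """Model of system-side handling of one quad-write transfer.

    dw_high: DW# sampled high (data not dirty).
    burst: transfer terminated with SBRDYI# rather than SRDYI#.
    array_clocks: clocks a real write into the memory array needs.
    Returns (committed_to_array, clocks_used)."""
    minimum = 1 if burst else 2
    if dw_high:
        # dummy write: masked out of the write enables and
        # ready-terminated at the bus minimum
        return False, minimum
    return True, max(array_clocks, minimum)
```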
- controller 70 will invoke another replacement burst write-back cycle through MWB activation.
- WB STB will be asserted to latch the replaced line, addressed by miss address register 110, into the write back register set 118 for this second write-back.
- The system may assert either SRDYI# four times or SBRDYI# four times to terminate the transfers of a replacement quad write. Combinations of SRDYI# and SBRDYI# to terminate the transfers of a quad write are not supported. Once the system has asserted either SRDYI# or SBRDYI# to complete the first transfer, it must assert the same signal three more times to finish the transfer. The controller 70 will assert SBLAST# during the fourth transfer to indicate completion of the quad write.
- Figure 44 details a write-back cycle to memory, where both lines associated with a tag entry contain valid and dirty data.
- SADS# and all system cycle definition signals are driven valid in T1 (with the exception of DW#, which becomes valid in T2).
- SPOE# is asserted in ST2 to enable the system port 112 of the cache memory 72.
- the controller 70 supports 486 CPU systems only with write-back mode.
- Write-through mode is supported on a cycle-to-cycle basis by monitoring the PWT signal on the 486.
- the controller 70 updates the cache memory without updating the system memory if a cache hit occurs.
- Main memory is updated only when a dirty line is replaced.
- Figure 45 shows a cacheable write hit operation.
- the cycle is terminated in two clocks by assertion of RDYO# to the 486 CPU.
- the write data is simply latched into the cache data array by assertion of HPWEx# in T2.
- the associated valid and dirty bits for the double word are set in the controller 70 tag array.
- BLAST# will always be asserted on the first transfer, as only scalar writes are supported by the 486 CPU.
- the 486 CPU cannot burst write more than 32 bits.
- Cache memory 72 write-back architecture filters out most of the write traffic induced by the 486 CPU internal cache write-through policy. System write cycle latency is eliminated as the performance bottleneck. Furthermore, use of cache memory 72 allows write misses to be followed by zero wait states read/write hits with no idle bus clocks.
- controller 70 will essentially treat write miss cycles as write hit cycles, by writing the data directly into the cache and replacing one line in the cache.
- a one-clock latency is required to first move the data being replaced from the cache data array into the write back register set 118, for later write-back if dirty data was present in the replaced line.
- write miss cycles will occur in three clocks, instead of two.
- a system quad fetch will then occur, in order to bring in the remainder of the 16-byte line.
- This quad fetch can execute concurrently with any later read/write hit cycles driven by the CPU.
- the architecture of the controller 70 guarantees that the local write cycle and the following system quad fetch will in effect be a locked operation (although SLOCK# is not asserted). To ensure this, the write miss will not be completed until the controller 70 owns the system bus (SHLDA deasserted). Write misses occurring while the system bus is granted away will be stalled by delaying RDYO# until SHLDA is deasserted. If the controller 70 owns the system bus when the write miss occurs, the controller 70 will perform the system quad fetch to completion, by not granting SHLDA until the fetch and any subsequent write-back cycles are completed.
- write misses are of two types: write line misses and write tag misses:
- a cacheable write line miss will latch the write data into the cache data array, and assert RDYO# in three clocks to the CPU to terminate the cycle on the host side.
- a system quad fetch will be initiated in order to obtain the remaining data of the cache line.
- Incoming data from the system quad fetch will be merged with any dirty data in the line, including the just-written miss data from the CPU. No write-backs will occur from write line misses, as no tags are replaced. The details are as follows:
- Signal MALE is asserted in T2 to inhibit writes directly into the cache data array.
- the data is instead written first into write register 120, and then into the data array.
- a "mask" bit associated with each doubleword of the line inside the cache memory 72 is set. This mask bit is used to prevent the data written from the 486 CPU and any other dirty data from being overwritten by the corresponding data which will later be obtained from the pending quad fetch.
- RDYO# is asserted to the 486 CPU to terminate the local CPU cycle. This termination happens during the third clock of the cycle (one wait state).
- the host port of the cache memory 72 can serve any read/write hits with no wait states following any write miss.
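The mask-bit mechanism for write line misses can be sketched as a simplified model (class and method names are illustrative):

```python
class LineFillMerge:
    """Sketch of the write-line-miss mask mechanism: the CPU's write
    data is captured with a per-doubleword mask bit set, and
    doublewords later returned by the pending quad fetch are dropped
    wherever a mask bit protects the slot."""

    def __init__(self, line):
        self.line = list(line)      # current line contents
        self.mask = [False] * 4     # per-doubleword mask bits

    def cpu_write(self, idx, data):
        self.line[idx] = data       # via the write register
        self.mask[idx] = True       # protect from the quad fetch

    def quad_fetch_return(self, idx, data):
        if not self.mask[idx]:      # accepted only when unmasked
            self.line[idx] = data
```

A write to doubleword 1 followed by the four-doubleword fetch leaves the CPU's data in place and fills the other three slots from memory.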
- Figure 46 shows an example of a write miss followed directly by a read hit, and the subsequent processing on both busses that occurs.
- a write miss cycle detected by the controller 70 results in the miss address being latched into the cache memory 72 miss address register 110.
- the controller 70 then generates a quad fetch to main memory, at the address held in miss address register 110. This quad fetch is "wrapped around" the demand miss word on a 16-byte address boundary. The order will either follow 486-style address order or sequential address order.
- Data returned from the quad fetch is loaded into the cache memory 72 memory update register set 116. Similar to system quad fetches due to read line misses, the DW# signal is asserted from the controller 70 to the cache memory 72 to qualify incoming quad fetch data, so that the write miss data and any other dirty data is not overwritten.
- the controller 70 asserts SBLAST# during the fourth transfer of the quad fetch, to terminate the cycle. At the completion of the system quad fetch, the controller 70 then writes the contents of memory update register set 116 into the cache memory 72 data array by assertion of the QWR (Quad Write) signal. The updated entry in the data array is pointed to by the miss address register 110.
- Figure 47 shows a write line miss being terminated with one wait state, and the resulting system quad fetch that occurs.
- Write tag misses are similar to write line misses. However, a tag will be replaced during write tag miss processing, so write tag misses may be followed by write-back cycles. The quad fetch will be followed by one or two write-back cycles of the data for the replaced tag, if either line contained any "valid" and "dirty" data. Write-back cycles due to write tag misses are otherwise identical to those generated by read tag misses.
- MALE is asserted by the controller 70 in T2 to prevent the write data from directly overwriting data in the cache array, since this data may be dirty and must then be written back to memory. Instead, the write data is internally latched into the cache memory 72 write register 120.
- controller 70 will select one of the two candidate replacement entries, choosing an entry marked "invalid" if one exists, when 2-way set associative mode has been chosen.
- the LRU bits and an LRU policy will be used to make the selection if all entries of the replacement lines are marked "valid".
- the selected line is latched into the cache memory 72 write back register set 118.
- write register 120 is written into the cache memory 72 data array, since any potentially dirty data has been moved to write back register set 118.
- the controller 70 then terminates the write miss cycle through RDYO# assertion. As in the write line miss case, this will occur in three clocks.
- The write miss address is latched into miss address register 110, and a system quad fetch occurs at the address pointed to by miss address register 110, with the fetched data loading memory update register set 116. As before, SBLAST# terminates the system quad fetch.
- a subsequent read/write operation from the 486 with a different miss line address will be 'frozen' through delay of RDYO# or BRDYO# until the replacement cycle (if any) of this write miss is completed.
- write-back cycle(s) will occur if any valid and dirty data was replaced. These write-backs will occur as follows:
- Figure 48 shows a write tag miss.
- the line retrieved from the resulting system quad fetch replaces dirty data and a write-back cycle to main memory is generated.
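The zero/one/two write-back rule for a replaced tag can be sketched as follows (illustrative; `line_dirty` flags whether each of the tag's two associated lines holds any valid dirty data):

```python
def writebacks_for_replacement(tag_valid, line_dirty):
    """Number of quad write-back cycles generated when a tag entry is
    replaced: none if the selected tag entry is invalid, otherwise one
    per associated line holding any valid dirty data (two lines per
    tag, so zero, one, or two write-backs)."""
    if not tag_valid:
        return 0
    return sum(1 for d in line_dirty if d)
```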
- controller 70 signals (except DW#) are driven to valid levels in the ST1 state.
- the ST2 state will always follow the ST1 state.
- the controller 70 will remain in the ST2 state until either SRDYI# or SBRDYI# is asserted.
- the DW# output of the controller 70 will be asserted only in ST2 states. DW# will not be valid in ST1 states.
- the controller 70 will re-enter the ST1 state and drive another SADS# signal to the system.
- by asserting SBLAST#, the controller 70 indicates that the next assertion of the ready inputs will terminate this cycle.
- SHLDA latency can be longer than the corresponding hold acknowledge latency of the 486 CPU.
- SHOLD will be acknowledged by asserting SHLDA only if no further cycles are required to complete the previous transaction. For example, if a write miss triggered a system quad fetch further followed by two write-back cycles, an SHOLD request will not be acknowledged until all of those cycles have completed.
- controller 70 will respond to operations other than 486 CPU bus cycles.
- RESET clears the tag arrays (valid and dirty bits) of the controller 70. The controller 70 RESET should not always be tied directly to the system reset, as some systems may desire to reset the CPU separately from the memory array. Unlike write-through caches, the controller 70 must be reset together with the memory array in these cases. The controller 70 will be able to respond to SHOLD during RESET, as the 486 recognizes HOLD during RESET.
SHOLD/SHLDA OPERATION
- the controller 70 grants and receives the system side bus through the use of SHOLD (System Hold Request) and SHLDA (System Hold Acknowledge).
- SHOLD may be asserted at any time to the controller 70. If the system bus is idle, SHLDA will be asserted high to acknowledge the request.
- If SHOLD is still active at the next system bus cycle boundary (SBLAST# low and either SRDYI# or SBRDYI# asserted low), SHLDA will then be asserted high in acknowledgement. As in the 486 CPU, short SHOLD requests (those that appear during a bus cycle but disappear before the first bus cycle boundary) are ignored.
- the controller 70 will float all system side control, address, and data lines to allow another bus master access to the system bus.
- transceivers are turned around when the controller 70 either grants or receives the bus. This turning is transparent when the controller 70 grants the bus, as it occurs during the idle state that SHLDA is asserted, so systems need not wait to drive addresses when they sample SHLDA high.
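The acknowledgement rule above can be sketched as a small predicate (a simplified model; names are illustrative):

```python
def shlda_granted(shold, bus_idle, at_cycle_boundary):
    """When SHOLD may be acknowledged with SHLDA: immediately if the
    system bus is idle, otherwise only if SHOLD is still high at the
    next bus cycle boundary (SBLAST# low with a ready input low).
    Short requests that drop before the boundary are ignored, as on
    the 486 CPU."""
    if not shold:
        return False
    return bus_idle or at_cycle_boundary
```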
- flush operations can be invoked in two ways, one software and one hardware.
- the first method of generating flushes is by software execution of the 486 INVD and WBINVD instructions
- the 486 CPU correspondingly generates special cycles during these instructions, in addition to flushing its internal cache.
- the INVD produces the Flush cycle, while the WBINVD produces the Write-back cycle.
- Because the controller 70 utilizes a write-back architecture, its response to both Flush and Write-back cycles is identical. For both cycles, the controller 70 first copies all lines marked "dirty" back to memory. The duration required to execute the write-back cycles depends on the number of "dirty" lines to be copied back to main memory. The latency for controller 70 to complete this cycle will be much longer than that of the 486 CPU, due to these write-back cycles. Because of this latency, the controller 70 will recognize SHOLD requests during flush operations. SHLDA will be granted at the completion of the write-back cycle(s) corresponding to the tag entry currently being flushed. Next, the controller 70 will clear (flush) all directory valid bits, LRU bits, and dirty bits.
- the hardware method of generating a flush is through the activation of the FLUSH# input signal.
- the controller 70 FLUSH# pin should be connected directly to that of the 486 CPU.
- write-back cycles to memory will be performed when either the FLUSH# pin is asserted or either of the FLUSH or Write-back special cycles occur.
- Flush operations are the second way that write-back cycles can be generated.
- normal read/write tag misses may also produce write-back cycles. For each line of the data cache that contains valid dirty data, the entire line will be written to memory in a quad doubleword operation. Since there are two lines associated with each tag entry, either zero, one or two quad writes will occur for each tag. If two writes occur for a tag (both lines contain valid and dirty data), the controller 70 will assert MWB at the end of the first write in order to inform the cache memory 72 that the second line for the tag must be written to memory as well.
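The per-tag write-back pattern just described can be sketched as follows (a simplified model; the representation of the write operations and the "MWB" marker is illustrative):

```python
def flush_writebacks(tags):
    """tags: list of (line0_dirty, line1_dirty) pairs, one per tag
    entry, flagging whether each line holds valid dirty data.
    Returns the quad-write sequence a flush would generate; "MWB"
    marks where the controller informs the cache memory that a
    second line for the same tag must also be written back."""
    seq = []
    for t, (d0, d1) in enumerate(tags):
        if d0 and d1:
            seq += [(t, 0), "MWB", (t, 1)]
        elif d0:
            seq.append((t, 0))
        elif d1:
            seq.append((t, 1))
    return seq
```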
- Figure 50 details the beginning of a hardware flush operation due to assertion of the FLUSH# pin.
- controller 70 responds by acquiring the bus from the 486 CPU and begins the process of writing back lines which contain valid dirty data to memory.
- Flush Write-back cycles are generally identical to write-backs due to read/write tag misses, described before.
- DW# will be driven for each doubleword to indicate to the system whether the doubleword is dirty or not.
- System logic should use DW# as described in the Read Miss section, in both write enable logic and to hasten assertion of SRDYI#/SBRDYI# for non-dirty dummy writes.
- the controller 70 must be granted the local processor bus, in order to perform the flush operations. To acquire the 486 CPU local bus, the controller 70 will use the HOLD/HLDA protocol for either hardware or software flush operations. If the FLUSH# pin is asserted (hardware flush), the controller 70 will immediately assert HOLD to the CPU. When HLDA is returned by the CPU, the controller 70 will begin the flush operation, by using the local CPU bus to drive out flush addresses (addresses of lines which contain dirty data) and asserting CALE to latch these addresses within the cache memory 72.
- For a software flush, the controller 70 will assert HOLD to the CPU, followed by RDYO#. As the CPU recognizes HOLD on bus cycle boundaries, HLDA will be driven at the end of the 486 CPU Flush or Write-back cycle. Having obtained the bus, the controller 70 begins driving out flush addresses and CALE as previously described.
- the system memory controller must accommodate the controller 70 when snoop operations occur on the system bus due to another bus master, whether by DMA or another CPU.
- the controller 70 supports snoops through the SEADS# pin, which is functionally equivalent to the 486 CPU EADS# pin.
- the controller 70 has three new pins not present on the 486 CPU to allow snoop operations to correctly occur: SNPBUSY, SMEMWR#, and SMEMDIS.
- As the controller 70 has no SAHOLD (System Address Hold) pin, the controller 70 system bus must be tri-stated through the SHOLD/SHLDA protocol before the snoop can occur.
- the SHOLD/SHLDA method is the simplest and preferred method of allowing DMA to occur.
- the controller 70 will sample SMEMWR# to determine if a snoop read or a snoop write is occurring.
- controller 70 will obtain the local CPU address bus through assertion of AHOLD and/or BOFF#. This will allow the controller 70 to drive an invalidation cycle to the i486 CPU at the snoop write address.
- EADS# will be asserted to the CPU to trigger this invalidate internally in the CPU.
- Second, SNPBUSY will be asserted high to acknowledge the snoop operation.
- the controller 70 will check its internal array to determine if the snoop write is a hit or a miss. If the address is present and valid in the cache (a hit), the controller 70 will write the data for the snoop write directly into the cache memory 72 RAM array section 100.
- the system may process the write normally. There are two requirements that must be met in order for the controller 70 to correctly process the snoop write:
- the system must supply valid data and byte enables on the controller 70's system side for all snoop writes. The data and byte enables should be driven valid along with the system address SA<27:2>, SMEMWR#, and SEADS# pins.
- SNPBUSY will remain high for some number of clocks, while the snoop operation is continuing.
- the falling edge of SNPBUSY indicates to the system that the controller 70 has completed its internal lookup and is prepared to terminate the snoop write operation.
- the system must return SRDYI# or SBRDYI# after SNPBUSY has fallen, in order to terminate the snoop write operation. Since the external system cannot determine if a hit or miss has occurred, SRDYI# or SBRDYI# must be returned for all snoop write operations. If the snoop write has not completed on the system bus when the falling edge of SNPBUSY occurs, the system is free to withhold SRDYI# or SBRDYI# from the controller 70 and add wait states to the operation. SRDYI# and SBRDYI# are not sampled by the controller 70 during the snoop write until SNPBUSY has fallen.
- the controller 70 detects snoop reads through the combination of SEADS# low and SMEMWR# low. When this happens, SNPBUSY toggles and stays high for a varying number of clocks while the tag lookup occurs. While this tag lookup is in progress, the system memory controller should delay the cycle until one of two cases occurs:
- the requested data is not contained in the cache data array (a snoop read miss), and the system memory must supply the data. This result is indicated by SNPBUSY going low while SMEMDIS has remained low. On the clock that SNPBUSY is sampled low, the system memory controller may determine that it needs to supply the requested data. From the clock in which SEADS# is asserted to the controller 70, latency of performing the snoop read operation will be two clocks for the snoop miss case. A snoop read miss is shown in
- the requested data is contained in the cache RAM array (a snoop read hit). This result is indicated by the SMEMDIS (System MEMory Disable) output being driven high, which tells the system memory controller not to supply the data itself.
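The system memory controller's decode of a snoop read, as described above, can be sketched as follows (a simplified model; it assumes SMEMDIS high signals the hit case, consistent with the miss case being SNPBUSY falling while SMEMDIS remains low):

```python
def snoop_read_source(snpbusy_fell, smemdis_high):
    """Who supplies the data for a snoop read: while SNPBUSY is still
    high the tag lookup is in progress; SNPBUSY falling with SMEMDIS
    low means a snoop read miss (system memory supplies the data);
    SMEMDIS high (assumed for the hit case) disables system memory
    because the cache will supply the data."""
    if not snpbusy_fell:
        return "wait"
    return "cache" if smemdis_high else "system memory"
```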
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Memory System Of A Hierarchy Structure (AREA)
Abstract
Cache apparatus (12) compatible with a wide variety of bus transfer types, including non-burst and burst transfers. In burst mode, a "demand word first" wrapped quad access order is supported. The cache system decouples the main memory subsystem from the host data bus so as to allow parallel cache-hit and system memory transfer operations, to increase system speed, and to hide system memory write-back cycles from the microprocessor. Differences in local and system bus speeds are accommodated, and an easy migration path is provided between non-burst-mode and burst-mode microprocessor systems. The cache apparatus (72) comprises a random access memory (72A-72D), a host port (HP), and a system port (SP). The cache apparatus (72) also comprises an input latch connected to the host port for selectively writing data into the memory, and an output register connected to the system port for receiving data from the memory and selectively supplying the data to the host port or the system port. In one embodiment, the input latch is a memory write register, and the output register comprises a read hold register and a write-back register.
Applications Claiming Priority (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US54607190A | 1990-06-27 | 1990-06-27 | |
US546,071 | 1990-06-27 | ||
US67891491A | 1991-04-01 | 1991-04-01 | |
US678,912 | 1991-04-01 | ||
US07/678,912 US5488709A (en) | 1990-06-27 | 1991-04-01 | Cache including decoupling register circuits |
US678,914 | 1991-04-01 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO1992000590A1 true WO1992000590A1 (fr) | 1992-01-09 |
Family
ID=27415469
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US1991/004484 WO1992000590A1 (fr) | 1990-06-27 | 1991-06-24 | Ante-memoire a acces selectif |
Country Status (2)
Country | Link |
---|---|
AU (1) | AU8298491A (fr) |
WO (1) | WO1992000590A1 (fr) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0782079A1 (fr) * | 1995-12-18 | 1997-07-02 | Texas Instruments Incorporated | Accès à rafale en systèmes de traitement de données |
EP0598570B1 (fr) * | 1992-11-13 | 2000-01-19 | National Semiconductor Corporation | Microprocesseur comprenant un système de configuration de régions et procédé pour contrôler les opérations du sous-système mémoire par régions d'adresses |
JP2006506014A (ja) * | 2002-11-07 | 2006-02-16 | アダプティクス、インク | マルチキャリヤ通信システム(multi−carriercommunicationsystem)における適応キャリヤ割り当てと電力コントロール方法及び装置 |
JP2010200367A (ja) * | 2010-05-12 | 2010-09-09 | Adaptix Inc | マルチキャリヤ通信システム(multi−carriercommunicationsystem)における適応キャリヤ割り当てと電力コントロール方法及び装置 |
US8738020B2 (en) | 2000-12-15 | 2014-05-27 | Adaptix, Inc. | Multi-carrier communications with adaptive cluster configuration and switching |
US8760992B2 (en) | 2004-12-07 | 2014-06-24 | Adaptix, Inc. | Method and system for switching antenna and channel assignments in broadband wireless networks |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4403288A (en) * | 1981-09-28 | 1983-09-06 | International Business Machines Corporation | Methods and apparatus for resetting peripheral devices addressable as a plurality of logical devices |
US4577293A (en) * | 1984-06-01 | 1986-03-18 | International Business Machines Corporation | Distributed, on-chip cache |
-
1991
- 1991-06-24 WO PCT/US1991/004484 patent/WO1992000590A1/fr unknown
- 1991-06-24 AU AU82984/91A patent/AU8298491A/en not_active Abandoned
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4403288A (en) * | 1981-09-28 | 1983-09-06 | International Business Machines Corporation | Methods and apparatus for resetting peripheral devices addressable as a plurality of logical devices |
US4577293A (en) * | 1984-06-01 | 1986-03-18 | International Business Machines Corporation | Distributed, on-chip cache |
Cited By (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0598570B1 (fr) * | 1992-11-13 | 2000-01-19 | National Semiconductor Corporation | Microprocesseur comprenant un système de configuration de régions et procédé pour contrôler les opérations du sous-système mémoire par régions d'adresses |
EP0782079A1 (fr) * | 1995-12-18 | 1997-07-02 | Texas Instruments Incorporated | Accès à rafale en systèmes de traitement de données |
JPH09179780A (ja) * | 1995-12-18 | 1997-07-11 | Texas Instr Inc <Ti> | バースト可でキャッシュ不可のメモリアクセスを支援するマイクロプロセッサ装置 |
US8743717B2 (en) | 2000-12-15 | 2014-06-03 | Adaptix, Inc. | Multi-carrier communications with adaptive cluster configuration and switching |
US9210708B1 (en) | 2000-12-15 | 2015-12-08 | Adaptix, Inc. | OFDMA with adaptive subcarrier-cluster configuration and selective loading |
US9344211B2 (en) | 2000-12-15 | 2016-05-17 | Adaptix, Inc. | OFDMA with adaptive subcarrier-cluster configuration and selective loading |
US8738020B2 (en) | 2000-12-15 | 2014-05-27 | Adaptix, Inc. | Multi-carrier communications with adaptive cluster configuration and switching |
US8743729B2 (en) | 2000-12-15 | 2014-06-03 | Adaptix, Inc. | Multi-carrier communications with adaptive cluster configuration and switching |
US9191138B2 (en) | 2000-12-15 | 2015-11-17 | Adaptix, Inc. | OFDMA with adaptive subcarrier-cluster configuration and selective loading |
US8750238B2 (en) | 2000-12-15 | 2014-06-10 | Adaptix, Inc. | Multi-carrier communications with adaptive cluster configuration and switching |
US9219572B2 (en) | 2000-12-15 | 2015-12-22 | Adaptix, Inc. | OFDMA with adaptive subcarrier-cluster configuration and selective loading |
US8767702B2 (en) | 2000-12-15 | 2014-07-01 | Adaptix, Inc. | Multi-carrier communications with adaptive cluster configuration and switching |
US9203553B1 (en) | 2000-12-15 | 2015-12-01 | Adaptix, Inc. | OFDMA with adaptive subcarrier-cluster configuration and selective loading |
US8891414B2 (en) | 2000-12-15 | 2014-11-18 | Adaptix, Inc. | Multi-carrier communications with adaptive cluster configuration and switching |
US8934445B2 (en) | 2000-12-15 | 2015-01-13 | Adaptix, Inc. | Multi-carrier communications with adaptive cluster configuration and switching |
US8934375B2 (en) | 2000-12-15 | 2015-01-13 | Adaptix, Inc. | OFDMA with adaptive subcarrier-cluster configuration and selective loading |
US8958386B2 (en) | 2000-12-15 | 2015-02-17 | Adaptix, Inc. | Multi-carrier communications with adaptive cluster configuration and switching |
US8964719B2 (en) | 2000-12-15 | 2015-02-24 | Adaptix, Inc. | OFDMA with adaptive subcarrier-cluster configuration and selective loading |
US8005479B2 (en) | 2002-11-07 | 2011-08-23 | Adaptix, Inc. | Method and apparatus for adaptive carrier allocation and power control in multi-carrier communication systems |
JP2006506014A (ja) * | 2002-11-07 | 2006-02-16 | アダプティクス、インク | Method and apparatus for adaptive carrier allocation and power control in a multi-carrier communication system |
US8797970B2 (en) | 2004-12-07 | 2014-08-05 | Adaptix, Inc. | Method and system for switching antenna and channel assignments in broadband wireless networks |
US8760992B2 (en) | 2004-12-07 | 2014-06-24 | Adaptix, Inc. | Method and system for switching antenna and channel assignments in broadband wireless networks |
JP2010200367A (ja) * | 2010-05-12 | 2010-09-09 | Adaptix Inc | Method and apparatus for adaptive carrier allocation and power control in a multi-carrier communication system |
Also Published As
Publication number | Publication date |
---|---|
AU8298491A (en) | 1992-01-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US5488709A (en) | Cache including decoupling register circuits | |
US5732241A (en) | Random access cache memory controller and system | |
US5745732A (en) | Computer system including system controller with a write buffer and plural read buffers for decoupled busses | |
JP3067112B2 (ja) | Method for reloading deferred pushes into a copy-back data cache | |
US5469555A (en) | Adaptive write-back method and apparatus wherein the cache system operates in a combination of write-back and write-through modes for a cache-based microprocessor system | |
US5822767A (en) | Method and apparatus for sharing a signal line between agents | |
US5355467A (en) | Second level cache controller unit and system | |
JP3218317B2 (ja) | Integrated cache unit and method of configuring same | |
US5210845A (en) | Controller for two-way set associative cache | |
US5787486A (en) | Bus protocol for locked cycle cache hit | |
JP3158161B2 (ja) | Integrated cache unit and method for caching interlock variables in an integrated cache unit | |
US5715428A (en) | Apparatus for maintaining multilevel cache hierarchy coherency in a multiprocessor computer system | |
US5627992A (en) | Organization of an integrated cache unit for flexible usage in supporting microprocessor operations | |
US5903911A (en) | Cache-based computer system employing memory control circuit and method for write allocation and data prefetch | |
US5778433A (en) | Computer system including a first level write-back cache and a second level cache | |
JP2849327B2 (ja) | Method of operating a computer system to override write-protect status during system management mode execution, and computer system | |
US20240320154A1 (en) | Multi-level cache security | |
GB2296353A (en) | Cache memory system with reduced request-blocking | |
JPH08263373A (ja) | Snooping apparatus and method in a cache | |
US5822756A (en) | Microprocessor cache memory way prediction based on the way of a previous memory read | |
US5724550A (en) | Using an address pin as a snoop invalidate signal during snoop cycles | |
US5717894A (en) | Method and apparatus for reducing write cycle wait states in a non-zero wait state cache system | |
JP3218316B2 (ja) | Integrated cache unit and method for implementing cache functions therein | |
US5860113A (en) | System for using a dirty bit with a cache memory | |
EP0309995B1 (fr) | System for fast selection of non-cacheable address fields using a programmable logic array |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AK | Designated states | Kind code of ref document: A1. Designated state(s): AT AU BB BG BR CA CH CS DE DK ES FI GB HU JP KP KR LK LU MC MG MN MW NL NO PL RO SD SE SU |
| AL | Designated countries for regional patents | Kind code of ref document: A1. Designated state(s): AT BE BF BJ CF CG CH CI CM DE DK ES FR GA GB GN GR IT LU ML MR NL SE SN TD TG |
| REG | Reference to national code | Ref country code: DE. Ref legal event code: 8642 |
| NENP | Non-entry into the national phase | Ref country code: CA |