
WO1996037844A1 - Pipelined microprocessor making memory requests to a cache memory and to an external memory controller during the same clock cycle - Google Patents


Info

Publication number
WO1996037844A1
WO1996037844A1 (application PCT/US1996/007091)
Authority
WO
WIPO (PCT)
Prior art keywords
request
memory
address
bus
predetermined time
Prior art date
Application number
PCT/US1996/007091
Other languages
English (en)
Inventor
Robert Divivier
Mario D. Nemirovsky
Robert Bignell
Original Assignee
National Semiconductor Corporation
Priority date
Filing date
Publication date
Application filed by National Semiconductor Corporation filed Critical National Semiconductor Corporation
Priority to KR1019970700548A priority Critical patent/KR970705086A/ko
Priority to EP96920253A priority patent/EP0772829A1/fr
Publication of WO1996037844A1 publication Critical patent/WO1996037844A1/fr

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0877 Cache access modes
    • G06F12/0884 Parallel mode, e.g. in parallel with main memory or CPU
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems

Definitions

  • the present invention relates to pipelined microprocessors and, more particularly, to a pipelined microprocessor that makes memory requests to a cache memory and an external memory controller during the same clock cycle when the system bus that connects an external memory to the processor is available.
  • a pipelined microprocessor is a microprocessor that operates on instructions in stages so that, at each stage of the pipeline, a different function is performed on an instruction. As a result, multiple instructions move through the pipe at the same time, much like to-be-assembled products move through a multistage assembly line.
  • FIG. 1 shows a block diagram that illustrates the flow of an instruction through a conventional pipelined processor.
  • the first stage in the pipe is a prefetch stage.
  • the to-be-executed instructions are retrieved from either a cache memory or an external memory, and are then sequentially loaded into a prefetch buffer.
  • the purpose of the prefetch stage is to fill the prefetch buffer so that one instruction can be advanced to the decode stage, the next stage in the pipe, with each clock cycle.
  • each instruction moving through the pipe is decoded to determine what operation is to be performed.
  • an operand stage determines if data will be needed to perform the operation and, if needed, retrieves the data from either one of several data registers, a data cache memory, or the external memory.
  • the operation specified by the instruction is performed in an execution stage, while the results of the operation are stored in a write-back stage.
  • each instruction is advanced from one stage to the next with each successive clock cycle.
  • the processor appears to complete the execution of each instruction in only one clock cycle.
  • With a branch instruction, the next instruction to be executed is determined by the outcome of the operation.
  • if the outcome of the operation calls for an instruction already in the pipe, the pipeline continues to function as described above. However, if the outcome of the operation requires an instruction not currently in the pipe, then the instructions in the prefetch, decode, and operand stages must be flushed from the pipeline and replaced by the alternate instructions required by the branch condition.
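The stage-by-stage flow and the branch flush described above can be sketched as a minimal Python model. The class and method names are illustrative assumptions, not terms from the patent, which describes hardware stages rather than software:

```python
# Minimal sketch of the five-stage pipeline described above. Each call to
# advance() models one clock cycle: every instruction moves one stage
# forward, and the instruction leaving write-back is retired.

STAGES = ["prefetch", "decode", "operand", "execute", "write_back"]

class Pipeline:
    def __init__(self):
        # One instruction slot per stage; None represents a bubble.
        self.slots = {stage: None for stage in STAGES}

    def advance(self, next_instruction):
        """Shift every instruction one stage forward (one clock cycle)
        and return the instruction leaving the write-back stage."""
        retired = self.slots["write_back"]
        for dst, src in zip(reversed(STAGES), reversed(STAGES[:-1])):
            self.slots[dst] = self.slots[src]
        self.slots["prefetch"] = next_instruction
        return retired

    def flush_after_branch(self):
        """Branch outcome requires instructions not in the pipe: flush the
        prefetch, decode, and operand stages behind the executing branch."""
        for stage in ("prefetch", "decode", "operand"):
            self.slots[stage] = None
```

Once the pipe is full, each `advance()` retires one instruction, which is why the processor appears to complete the execution of each instruction in a single clock cycle even though every instruction spends several cycles in flight.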
  • One problem with this approach is that when the needed instruction is not stored in the cache memory, the processor must waste at least one clock cycle to establish this fact.
  • One solution to this problem is to simply increase the size of the cache memory, thereby increasing the likelihood that the needed instruction will be stored in the cache memory.
  • a memory request is first made to a cache memory and then, if the request is not stored in the cache memory, to an external memory.
  • several clock cycles can be lost by first accessing the cache memory rather than the external memory.
  • these lost clock cycles are eliminated by utilizing a dual memory access circuit that makes a memory request to both the cache memory and an external memory controller during the same clock cycle when the bus that connects the external memory to the processor is available.
  • the cycle time lost when the request is not stored in the cache memory can be eliminated.
  • the unneeded external memory requests that result each time the request is stored in the cache memory are eliminated by gating the request to the external memory controller with a logic signal output by the cache memory that indicates whether the request is present.
  • a dual memory access circuit in accordance with the present invention includes a memory access controller that monitors the availability of a system bus, and that outputs a first request in response to needed information when the system bus is unavailable. On the other hand, when the system bus is available, the memory access controller outputs the first request and a second request in response to the needed information. In addition, when both the first and second requests are output, the memory access controller also asserts a dual request signal.
  • the dual memory access circuit further includes a cache controller that determines whether the needed information is stored in the cache memory and, if the needed information is found and the dual request signal is asserted, asserts a terminate request signal. An external memory controller gates the second request and the terminate request signal so that when the terminate request signal is asserted, the second request is gated out, and so that when the terminate request signal is deasserted, the second request signal is latched by the external memory controller.
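The controller behavior summarized above can be sketched as follows. The function and signal names mirror the description (RAD1, RAD2, DAS) but are illustrative assumptions, not the patent's implementation:

```python
# Sketch of the dual-request decision: when the system bus is free, the
# memory access controller issues requests to both the cache and the
# external memory controller in the same clock cycle and asserts the dual
# request signal; otherwise only the cache request is issued.

def issue_requests(address, bus_available):
    """Return (cache_request, external_request, dual_address_signal)."""
    cache_request = address            # first request: always to the cache
    if bus_available:
        external_request = address     # second request: to the DRAM controller
        dual_address_signal = True     # asserted only when both requests go out
    else:
        external_request = None        # no cycle can be saved; cache only
        dual_address_signal = False
    return cache_request, external_request, dual_address_signal
```

A cache hit then asserts the terminate request signal, which gates the second request out before the external memory controller acts on it.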
  • FIG. 1 is a block diagram illustrating the flow of an instruction through a conventional pipelined processor.
  • FIG. 2 is a block diagram illustrating a dual memory access circuit 100 in accordance with the present invention.
  • FIG. 3 is a block diagram illustrating the operation of cache memory 120 and cache controller 130.
  • FIG. 4 is a timing diagram illustrating the dual memory request.
  • FIG. 5 is a block diagram illustrating the operation of DRAM controller 140.
  • DETAILED DESCRIPTION FIG. 2 shows a block diagram of a dual memory access circuit 100 in accordance with the present invention.
  • circuit 100 includes a memory access controller 110 that controls memory requests to a cache memory 120 and an external memory to obtain instructions requested by the prefetch and execution stages of a pipelined microprocessor. Both cache memory 120 and the external memory, in turn, supply the requested instructions back to the prefetch stage of the microprocessor.
  • an 8-bit word is output by cache memory 120, whereas a 16-bit word is output by the external memory.
  • a prefetch buffer stores the next sequence of instructions to be decoded.
  • Memory access controller 110 monitors the status of the prefetch buffer and, when the prefetch buffer is less than full, requests the next instruction in the sequence from memory.
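A minimal sketch of this fill policy follows. The buffer depth of four entries is an assumption for illustration; the patent does not specify the capacity:

```python
# Sketch of the prefetch-fill policy described above: the controller
# requests the next sequential instruction whenever the prefetch buffer
# has a free slot, so that one instruction can be advanced to the decode
# stage each clock cycle.

PREFETCH_CAPACITY = 4  # assumed buffer depth; not specified by the patent

def should_request(prefetch_buffer):
    """Issue a memory request only while the buffer is less than full."""
    return len(prefetch_buffer) < PREFETCH_CAPACITY
```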
  • memory access controller 110 determines which memory the request will be directed to in response to the status of the system bus that connects the external memory to the processor.
  • memory access controller 110 requests the next instruction from cache memory 120 by outputting a requested address RAD1, as defined by a prefetch instruction pointer, to a cache controller 130.
  • Cache controller 130 determines whether the requested address RAD1 is stored in cache memory 120 by comparing the requested address RAD1 to the address tags stored in cache memory 120.
  • FIG. 3 shows a block diagram that illustrates the operation of cache memory 120 and cache controller 130. As shown in FIG. 3, cache memory 120, which is configured as a direct-mapped memory, includes a data RAM 122 that stores eight bytes of data in each line of memory (the number of bytes per line is arbitrary), and a tag RAM 124 that stores tag fields which identify the page of memory to which each line of memory corresponds.
  • a valid bit RAM 126 indicates whether the information stored in data RAM 122 is valid.
  • the requested address RAD1 output from memory access controller 110 includes a page field that identifies the page of memory, a line field that identifies the line of the page, and a "don't care" byte position field.
  • cache controller 130 utilizes the line field of the requested address RAD1 to identify the corresponding page field stored in tag RAM 124.
  • the page field from tag RAM 124 is then compared to the page field of the requested address RAD1 to determine if the page fields match. If the page fields match and the corresponding bit in valid bit RAM 126 indicates that the information is valid, cache controller 130 asserts a cache hit signal CHS, which indicates a match, while cache memory 120 outputs the instruction associated with the requested address RAD1. If, on the other hand, the page fields do not match, thereby indicating that the requested address RAD1 is not stored in cache memory 120, then the instruction associated with the requested address RAD1 must be retrieved from the external memory.
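The direct-mapped lookup described above can be sketched as follows. Eight bytes per line matches the data RAM description; the 32-lines-per-page width is an assumption for illustration, as are the function names:

```python
# Sketch of the direct-mapped tag check: the line field of the requested
# address indexes the tag RAM, and a hit requires both a page-field match
# and a set valid bit, upon which the cache hit signal (CHS) is asserted.

LINE_BYTES = 8     # bytes per line, as in data RAM 122
PAGE_LINES = 32    # lines per page: assumed width for this sketch

def split_address(addr):
    """Split an address into (page, line, byte) fields."""
    byte = addr % LINE_BYTES
    line = (addr // LINE_BYTES) % PAGE_LINES
    page = addr // (LINE_BYTES * PAGE_LINES)
    return page, line, byte

def lookup(tag_ram, valid_bits, addr):
    """Return True on a cache hit (CHS asserted), False on a miss."""
    page, line, _byte = split_address(addr)   # byte field is "don't care"
    return bool(valid_bits[line]) and tag_ram[line] == page
```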
  • memory access controller 110 requests the next instruction from both cache memory 120 and the external memory during the same clock cycle. (In the FIG. 2 embodiment, the request is also output to an external bus controller which controls, among other things, the read-only memory (ROM) and the disk drives). Controller 110 initiates the requests by outputting the requested address RAD1 to cache controller 130 while also outputting a requested address RAD2 to a dynamic random access memory (DRAM) controller 140. In most cases, the requested addresses RAD1 and RAD2 will be identical but may differ as required by the addressing scheme being utilized.
  • FIG. 4 shows a timing diagram that illustrates the dual memory request.
  • memory access controller 110 also asserts a dual address signal DAS which indicates that requests are going to both memories, and a bus request signal BRS which indicates that the requested address RAD2 is valid.
  • when cache controller 130 matches the requested address RAD1 with one of the address tags while the dual address signal DAS is asserted, cache controller 130 asserts a terminate address signal TAS at approximately the same time that the cache hit signal CHS is asserted. (See FIG. 3).
  • DRAM controller 140 gates the bus request signal BRS with the terminate address signal TAS to determine whether the external memory request should continue.
  • FIG. 5 shows a block diagram that illustrates the operation of DRAM controller 140. As shown in FIG. 5, when the terminate address signal TAS is asserted, the bus request signal BRS is gated out, thereby terminating the memory request made to the external memory.
  • when the terminate address signal TAS is deasserted, the second requested address RAD2 is latched by DRAM controller 140 on the falling edge of the same system clock cycle that initiated the requested addresses RAD1 and RAD2.
  • Cache controller 130 deasserts the cache hit signal CHS and the terminate address signal TAS when cache controller 130 determines that the requested address RAD1 does not match any of the address tags, i.e., the page fields do not match.
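The gating performed by DRAM controller 140 reduces to ANDing the bus request signal with the inverse of the terminate address signal; a sketch with illustrative names:

```python
# Sketch of the gating in DRAM controller 140: the second requested
# address (RAD2) is latched only when the bus request signal (BRS) is
# asserted and the terminate address signal (TAS) is not, so a cache hit
# cancels the external request before it reaches the system bus.

def dram_controller_gate(brs, tas, rad2):
    """Return the latched address, or None if the request is gated out."""
    if brs and not tas:
        return rad2        # miss: request proceeds to the external memory
    return None            # hit (TAS asserted) or no request (BRS low)
```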
  • the present invention also follows the same approach with respect to instructions requested by the execution stage.
  • the outcome of a branch instruction frequently calls for an instruction which is not in the pipeline.
  • memory access controller 110 must request the instruction from either cache memory 120 or the external memory.
  • when the system bus is available, memory access controller 110 asserts the dual address signal DAS and the bus request signal BRS, and outputs the requested addresses RAD1 and RAD2 to cache controller 130 and DRAM controller 140, respectively. On the other hand, if the system bus is unavailable, memory access controller 110 only outputs the requested address RAD1 to cache controller 130.
  • the advantage of simultaneously requesting the next instruction from both cache controller 130 and DRAM controller 140 when the bus is available is that each time the next instruction is absent from cache memory 120, the present invention saves the wasted clock cycle that is required to first check cache memory 120. This, in turn, improves the performance of the processor. If, on the other hand, the system bus is unavailable, memory access controller 110 must wait at least one clock cycle to access the system bus in any case. Thus, no cycle time can be saved when the bus is unavailable. As a result, controller 110 only outputs the requested address RAD1 to cache controller 130 when the bus is unavailable.
  • Terminating the bus request signal BRS when cache controller 130 determines that the page fields match also has the added advantage of decreasing the bus utilization by the processor. If the bus request signal BRS were not terminated each time the page fields match, the system bus would be tied up servicing an unneeded request, thereby limiting the utilization of the bus. In addition, the logic within memory access controller 110 would have to be configured to wait for the unneeded instruction to return from the DRAM. Further, terminating the bus request signal BRS when cache controller 130 determines that the page fields match has the additional advantage of maintaining the page hit registers stored in DRAM controller 140. When a DRAM is accessed by DRAM controller 140, the address is divided into a row address, which defines a page of memory, and a column address, which defines the individual bytes stored in the page.
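The row/column split mentioned above, and the page-hit check it enables, can be sketched as follows. The 10-bit column field is an assumption for illustration; actual widths depend on the DRAM devices:

```python
# Sketch of the DRAM row/column address split: the row address selects a
# page of memory, and keeping the last-used row in a page-hit register
# lets the controller skip the row-address phase when successive accesses
# fall within the same page.

COLUMN_BITS = 10  # assumed column-address width for this sketch

def split_dram_address(addr):
    """Split a physical address into (row, column) fields."""
    row = addr >> COLUMN_BITS               # row address: selects the page
    col = addr & ((1 << COLUMN_BITS) - 1)   # column address: byte in page
    return row, col

def page_hit(page_hit_register, addr):
    """True if the access falls in the currently open DRAM page."""
    row, _col = split_dram_address(addr)
    return row == page_hit_register
```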

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Advance Control (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

This invention concerns memory requests made to a cache memory and to an external memory controller during the same clock cycle when the bus connected to the external memory is available. By making both memory requests during the same clock cycle, rather than first accessing the cache memory as is conventionally done, the cycle time lost when the request is not stored in the cache memory can be eliminated. The unneeded external memory requests, which occur each time the request is stored in the cache memory, are eliminated by gating the request to the external memory controller with a logic signal output by the cache memory that indicates whether or not the request is present.
PCT/US1996/007091 1995-05-26 1996-05-16 Pipelined microprocessor making memory requests to a cache memory and to an external memory controller during the same clock cycle WO1996037844A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
KR1019970700548A KR970705086A (ko) 1995-05-26 1996-05-16 같은 클락 사이클 동안에 캐쉬 메모리와 외부 메모리 제어기로 메모리 요청을 하는 파이프라인 마이크로프로세서(A Pipelined Microprocessor that Makes Memory Requests to a Cache Memory and an external Memory Controller During the Same Clock Cycle)
EP96920253A 1995-05-26 1996-05-16 EP0772829A1 (fr) Pipelined microprocessor making memory requests to a cache memory and to an external memory controller during the same clock cycle

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US45230695A 1995-05-26 1995-05-26
US08/452,306 1995-05-26

Publications (1)

Publication Number Publication Date
WO1996037844A1 true WO1996037844A1 (fr) 1996-11-28

Family

ID=23795977

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1996/007091 WO1996037844A1 (fr) 1995-05-26 1996-05-16 Pipelined microprocessor making memory requests to a cache memory and to an external memory controller during the same clock cycle

Country Status (3)

Country Link
EP (1) EP0772829A1 (fr)
KR (1) KR970705086A (fr)
WO (1) WO1996037844A1 (fr)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR19990057839A (ko) * 1997-12-30 1999-07-15 김영환 Method of handling a cache miss


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0461925A2 * 1990-06-15 1991-12-18 Compaq Computer Corporation Cache memory with parallel look-up (fr)
EP0468786A2 * 1990-07-27 1992-01-29 Dell Usa L.P. Processor performing memory accesses in parallel with cache accesses, and method therefor (fr)
EP0690387A1 * 1994-06-30 1996-01-03 Digital Equipment Corporation Early arbitration (fr)

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9928172B2 (en) 2013-10-21 2018-03-27 Marvell World Trade Ltd. Method and apparatus for accessing data stored in a storage system that includes both a final level of cache and a main memory
US11360894B2 (en) 2013-10-21 2022-06-14 Flc Global, Ltd. Storage system and method for accessing same
US9323688B2 (en) 2013-10-21 2016-04-26 Marvell World Trade Ltd. Method and apparatus for accessing data stored in a storage system that includes both a final level of cache and a main memory
US9454991B2 (en) 2013-10-21 2016-09-27 Marvell World Trade Ltd. Caching systems and methods for hard disk drives and hybrid drives
US9477611B2 (en) 2013-10-21 2016-10-25 Marvell World Trade Ltd. Final level cache system and corresponding methods
US9559722B1 (en) 2013-10-21 2017-01-31 Marvell International Ltd. Network devices and methods of generating low-density parity-check codes and performing corresponding encoding of data
US9594693B2 (en) 2013-10-21 2017-03-14 Marvell World Trade Ltd. Method and apparatus for accessing data stored in a storage system that includes both a final level of cache and a main memory
US9733841B2 (en) 2013-10-21 2017-08-15 Marvell World Trade Ltd. Caching systems and methods for hard disk drives and hybrid drives
US9182915B2 (en) 2013-10-21 2015-11-10 Marvell World Trade Ltd. Method and apparatus for accessing data stored in a storage system that includes both a final level of cache and a main memory
US11822474B2 (en) 2013-10-21 2023-11-21 Flc Global, Ltd Storage system and method for accessing same
WO2015061337A1 * 2013-10-21 2015-04-30 Sehat Sutardja Final level cache system and corresponding method (fr)
US10684949B2 (en) 2013-10-21 2020-06-16 Flc Global, Ltd. Method and apparatus for accessing data stored in a storage system that includes both a final level of cache and a main memory
US10097204B1 (en) 2014-04-21 2018-10-09 Marvell International Ltd. Low-density parity-check codes for WiFi networks
US10761737B2 (en) 2014-05-02 2020-09-01 Marvell Asia Pte, Ltd. Method and apparatus for caching data in an solid state disk (SSD) of a hybrid drive that includes the SSD and a hard disk drive (HDD)
US10067687B2 (en) 2014-05-02 2018-09-04 Marvell World Trade Ltd. Method and apparatus for storing data in a storage system that includes a final level cache (FLC)
US11556469B2 (en) 2018-06-18 2023-01-17 FLC Technology Group, Inc. Method and apparatus for using a storage system as main memory
US11880305B2 (en) 2018-06-18 2024-01-23 FLC Technology Group, Inc. Method and apparatus for using a storage system as main memory

Also Published As

Publication number Publication date
EP0772829A1 (fr) 1997-05-14
KR970705086A (ko) 1997-09-06

Similar Documents

Publication Publication Date Title
US5680564A (en) Pipelined processor with two tier prefetch buffer structure and method with bypass
US5857094A (en) In-circuit emulator for emulating native instruction execution of a microprocessor
JP4859616B2 (ja) Architecture of a reduced-instruction-set-computer microprocessor
US6401192B1 (en) Apparatus for software initiated prefetch and method therefor
EP0345325B1 (fr) Memory system
US4888679A (en) Method and apparatus using a cache and main memory for both vector processing and scalar processing by prefetching cache blocks including vector data elements
CA1332248C (fr) Interface commandee par un processeur et executant des instructions de facon continue
US5752269A (en) Pipelined microprocessor that pipelines memory requests to an external memory
US5860105A (en) NDIRTY cache line lookahead
US5651130A (en) Memory controller that dynamically predicts page misses
US5809514A (en) Microprocessor burst mode data transfer ordering circuitry and method
US6823430B2 (en) Directoryless L0 cache for stall reduction
EP0772829A1 (fr) Pipelined microprocessor making memory requests to a cache memory and to an external memory controller during the same clock cycle
JPH0830454A (ja) Pipelined cache system with short effective latency for non-sequential accesses
US5898815A (en) I/O bus interface recovery counter dependent upon minimum bus clocks to prevent overrun and ratio of execution core clock frequency to system bus clock frequency
US20030196072A1 (en) Digital signal processor architecture for high computation speed
US5546353A (en) Partitioned decode circuit for low power operation
US5659712A (en) Pipelined microprocessor that prevents the cache from being read when the contents of the cache are invalid
JPH08263371A (ja) Apparatus and method for generating copy-back addresses in a cache
US4620277A (en) Multimaster CPU system with early memory addressing
JPH1083343A (ja) Method of accessing a memory
US5717891A (en) Digital signal processor with caching of instructions that produce a memory conflict
US5649147A (en) Circuit for designating instruction pointers for use by a processor decoder
KR20000003930A (ko) Instruction fetch apparatus for reducing losses on an instruction cache miss
EP0771442A1 (fr) Checking the limits of an instruction memory in a microprocessor

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): DE KR

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE

WWE Wipo information: entry into national phase

Ref document number: 1019970700548

Country of ref document: KR

WWE Wipo information: entry into national phase

Ref document number: 1996920253

Country of ref document: EP

121 Ep: the epo has been informed by wipo that ep was designated in this application
WWP Wipo information: published in national office

Ref document number: 1996920253

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWP Wipo information: published in national office

Ref document number: 1019970700548

Country of ref document: KR

WWW Wipo information: withdrawn in national office

Ref document number: 1996920253

Country of ref document: EP

WWW Wipo information: withdrawn in national office

Ref document number: 1019970700548

Country of ref document: KR
