
WO2004068361A1 - Storage control device, data cache control device, central processing unit, storage device control method, data cache control method, and cache control method - Google Patents

Storage control device, data cache control device, central processing unit, storage device control method, data cache control method, and cache control method

Info

Publication number
WO2004068361A1
WO2004068361A1 (PCT/JP2003/000723)
Authority
WO
WIPO (PCT)
Prior art keywords
cache
thread
data
cache line
consistency
Prior art date
Application number
PCT/JP2003/000723
Other languages
English (en)
Japanese (ja)
Inventor
Iwao Yamazaki
Original Assignee
Fujitsu Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Limited filed Critical Fujitsu Limited
Priority to JP2004567505A priority Critical patent/JP4180569B2/ja
Priority to PCT/JP2003/000723 priority patent/WO2004068361A1/fr
Publication of WO2004068361A1 publication Critical patent/WO2004068361A1/fr
Priority to US11/123,140 priority patent/US20050210204A1/en


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/0842Multiuser, multiprocessor or multiprocessing cache systems for multiprocessing or multitasking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/0811Multiuser, multiprocessor or multiprocessing cache systems with multilevel cache hierarchies

Definitions

  • Storage control device, data cache control device, central processing unit, storage device control method, data cache control method, and cache control method
  • the present invention relates to a storage control device, a data cache control device, and a central processing unit that process memory access requests issued from a plurality of threads executed simultaneously, and to a storage device control method, a data cache control method, and a cache control method, and in particular to a storage control device, a data cache control device, a central processing unit, a storage device control method, a data cache control method, and a cache control method that can guarantee consistency in the execution order of reading and writing of shared data between threads.
  • out-of-order processing is processing in which, while the data read of one instruction is delayed by a cache miss or the like, the data of the next instruction is read first, and the data of the delayed instruction is read afterwards.
  • TSO (Total Store Order)
  • FIG. 9 is an explanatory diagram for explaining a TSO violation in a multiprocessor and the principle of monitoring it. FIG. 9(a) shows an application that may cause a TSO violation,
  • FIG. 9(b) shows an example of a TSO violation, and
  • FIG. 9(c) shows the principle of monitoring for TSO violations.
  • FIG. 9(a) shows an example in which one CPU writes the data measured by a measuring instrument to a shared storage area, and another CPU reads and analyzes the data written to the shared storage area and outputs the analysis result.
  • CPU-α writes the measurement data to shared storage area B (ST-B: the data of B changes from b to b'), and then writes a notification that the measurement data has been written to shared storage area A (ST-A: the data of A changes from a to a').
  • a fetch request from the instruction processing unit is received at a fetch port of the storage control unit, and as shown in FIG. 9(c), each fetch port holds the address of its fetch request together with a PSTV (Post STatus Valid) flag, a RIM (Re-Ifetch by Move-in) flag, and a RIF flag.
  • FP-TOQ (Fetch Port Top Of Queue)
  • the PSTV flag of the fetch port that has received the request of FC-B is set.
  • the hatched portion indicates a state in which the flag is set.
  • the ST-B of CPU-α invalidates or flushes the cache line used by FC-B.
  • since the PSTV flag of the fetch port holding the request of FC-B is set, and the physical address portion of the address held by the fetch port matches the physical address of the received request to invalidate or flush the cache line, it can be detected that the cache line that supplied the fetch data has been taken away.
  • CPU-α executes ST-B and ST-A, and CPU-β receives the cache line containing A from CPU-α in order to execute FC-A. CPU-β detects this external receipt and sets the RIF flag of all valid fetch ports. When notifying the instruction processing unit of the success of FC-A, the RIM flag and the RIF flag of the fetch port holding the request of FC-A are checked; since both flags are set, re-execution of the instruction is requested.
  • the fact that both the RIM flag and the RIF flag are set indicates the possibility that another instruction processing unit rewrote the data b, which had already been sent in response to the subsequent fetch request B, to b', while the preceding fetch request A received the rewritten data a'.
  • a PSTV flag, a RIM flag, and a RIF flag are provided at each fetch port, and by monitoring the transfer of cache lines between processors, a TSO violation between processors can be prevented.
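the flag-based monitoring described above can be summarized in a small behavioral model. This is a sketch of one plausible reading of the scheme, not the patent's hardware: the class and method names are illustrative, and the choice to mark every valid fetch port when a move-out hits an already-answered fetch is an assumption made so the check at fetch completion works as described.

```python
# Behavioral sketch of the fetch-port flag scheme (illustrative names).
# PSTV: fetch data already sent; RIM: a supplying cache line was taken
# away; RIF: a cache line arrived from another processor.

class FetchPort:
    def __init__(self, address):
        self.address = address   # physical address of the fetch request
        self.pstv = False
        self.rim = False
        self.rif = False
        self.valid = True

class TsoMonitor:
    def __init__(self, ports):
        self.ports = ports

    def fetch_done(self, port):
        port.pstv = True         # fetch data sent: post-status valid

    def line_invalidated(self, address):
        # Another processor's store invalidated or flushed a line that
        # already supplied fetch data (PSTV set, address match).
        hit = any(p.valid and p.pstv and p.address == address
                  for p in self.ports)
        if hit:
            for p in self.ports:
                if p.valid:
                    p.rim = True

    def line_received(self):
        # A cache line was moved in from another processor.
        for p in self.ports:
            if p.valid:
                p.rif = True

    def must_reexecute(self, port):
        # Both flags set: the preceding fetch may have seen rewritten
        # data while the subsequent fetch already returned old data.
        return port.rim and port.rif
```

replaying the FIG. 9 scenario (FC-B answered with old b, ST-B takes the line away, FC-A moves in rewritten a') makes the FC-A port request re-execution.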
  • a TSO guarantee technique in a multiprocessor is disclosed in, for example, US Pat. No. 5,699,538.
  • Techniques related to cache memories are disclosed in JP-A-10-116192, JP-A-10-232839, JP-A-2000-259498, and JP-A-2001-195301.
  • the multi-thread method is a method in which one processor executes a plurality of threads (instruction sequences) simultaneously.
  • in the multi-thread method, the primary cache is shared between different threads, so it is necessary to monitor not only the transfer of cache lines between processors but also the transfer of cache lines between threads within the same cache.
  • the present invention has been made to solve the above-described problems of the related art, and it is an object of the present invention to provide a storage control device, a data cache control device, a central processing unit, a storage device control method, a data cache control method, and a cache control method that can guarantee the consistency of the execution order of reading and writing of shared data between threads.
Disclosure of the invention
  • the present invention is directed to a storage control device that is shared by a plurality of threads executed simultaneously and that processes memory access requests issued from the threads. The device comprises: a consistency assurance means for assuring the consistency of the execution order of reading and writing between a plurality of instruction processing devices with respect to data shared between the instruction processing devices; a thread judging unit for judging, when data at the address specified by a memory access request is stored, whether the thread that registered the stored data and the thread that issued the memory access request are the same; and a consistency assurance operation starting means for operating the consistency assurance means based on the judgment result of the thread judging unit.
  • the present invention also relates to a storage device control method for processing memory access requests issued from a plurality of threads executed simultaneously. The method comprises: a thread judging step of judging, when data at the address specified by a memory access request is stored, whether the thread that registered the stored data and the thread that issued the memory access request are the same; and a consistency assurance operation initiating step of operating, based on the judgment result of the thread judging step, a consistency assurance mechanism that guarantees the consistency of the execution order of reading and writing between a plurality of instruction processing devices with respect to data shared between the devices.
  • according to the present invention, whether the thread that registered the stored data and the thread that issued the memory access request are the same is determined, and a consistency assurance mechanism that guarantees the consistency of the execution order of reading and writing between a plurality of instruction processing devices with respect to data shared among them is operated based on the determination result. Therefore, it is possible to guarantee the consistency of the execution order of reading and writing of shared data between the threads.
  • the present invention is also directed to a data cache control device for processing memory access requests issued from a plurality of instruction processing devices. The device comprises: a consistency assurance means for assuring the consistency of the execution order of reading and writing between the plurality of instruction processing devices with respect to shared data; and a thread judging unit for judging, when a cache line including data at the address specified by a memory access request is stored, whether the thread that registered the stored cache line and the thread that issued the memory access request are the same.
  • the present invention also relates to a data cache control method for processing memory access requests issued from a plurality of threads executed simultaneously. The method comprises: a thread determination step of determining, when a cache line including data at the address specified by a memory access request is stored, whether the thread that registered the stored cache line and the thread that issued the memory access request are the same; and a consistency assurance operation initiating step of activating, when they are determined not to be the same, a consistency assurance mechanism that guarantees the consistency of the execution order of reading and writing between the plurality of instruction processing devices with respect to data shared among them.
  • according to the present invention, it is determined whether the thread that registered the stored cache line and the thread that issued the memory access request are the same, and when they are determined not to be the same, a consistency assurance mechanism that guarantees the consistency of the execution order of reading and writing between multiple instruction processing units with respect to data shared between them is operated. Therefore, it is possible to guarantee the consistency of the execution order of reading and writing of shared data between the threads.
  • the present invention also provides a central processing unit that has a plurality of sets of an instruction processing device, which executes a plurality of threads simultaneously, and a primary data cache device, and that has a secondary cache device shared by the plurality of sets of primary data cache devices. Each primary data cache device comprises: a consistency assurance means for guaranteeing the consistency of the execution order of reading and writing between the plurality of instruction processing units for a cache line shared with another set's primary data cache device; a fetch requesting means for making a request to the secondary cache device to fetch a cache line when a cache line whose physical address matches the memory access request is registered by a different thread; and a flush execution means for invalidating or flushing the cache line based on a request from the secondary cache device and operating the consistency assurance means. The secondary cache device requests the primary data cache device to invalidate or flush the cache line when the cache line for which the fetch request was received is registered in the primary data cache device by another thread.
  • the present invention also relates to a cache control method used in a central processing unit that has a plurality of sets of an instruction processing device, which executes a plurality of threads simultaneously, and a primary data cache device, and that has a secondary cache device shared by the plurality of sets of primary data cache devices. The method comprises: a fetch requesting step in which, when a cache line whose physical address matches a memory access request from the instruction processing device is registered by a different thread, the primary data cache device makes a fetch request for the cache line to the secondary cache device; and a step in which, when the cache line for which the fetch request was received is registered in the primary data cache device by another thread, the secondary cache device requests the primary data cache device to invalidate or flush the cache line.
  • according to the present invention, when a cache line whose physical address matches a memory access request from the instruction processing device is registered by a different thread, the primary data cache device issues a fetch request to the secondary cache device; if the cache line for which the fetch request was received is registered in the primary data cache device by another thread, the secondary cache device requests the primary data cache device to invalidate or flush the cache line; and the primary data cache device invalidates or flushes the cache line based on the request from the secondary cache device, thereby operating, for the cache line shared with another set's primary data cache device, the consistency assurance mechanism that guarantees the consistency of the execution order of reading and writing among the plurality of instruction processing units. The consistency of the execution order of reading and writing of shared data between threads can therefore be guaranteed.
  • the present invention also relates to a storage control device that is shared by a plurality of threads executed simultaneously and that processes memory access requests issued from the threads. The device comprises: an access invalidating means for invalidating, when the thread executed by the instruction processing device is switched, all uncommitted store instructions and fetch instructions among the store instructions and fetch instructions issued by the thread whose execution is interrupted; and an interlock means for detecting, when execution of the interrupted thread is resumed, a fetch instruction affected by the execution result of a committed store instruction, and for controlling the detected fetch instruction so that it is executed after the execution of the store instruction.
  • the present invention is also a storage device control method for processing memory access requests issued from a plurality of threads executed simultaneously. The method comprises: an access invalidating step of invalidating, when the thread executed by the instruction processing device is switched, all uncommitted store instructions and fetch instructions among the store instructions and fetch instructions issued by the thread whose execution is interrupted; and an interlock step of detecting, when the execution of the interrupted thread is resumed, a fetch instruction affected by the execution result of a committed store instruction, and controlling the detected fetch instruction so that it is executed after the execution of the store instruction.
  • FIG. 1 is a functional block diagram showing a configuration of a CPU according to the first embodiment;
  • FIG. 2 is a diagram showing an example of a cache tag;
  • FIG. 3 is a flowchart showing the processing procedure of the cache control unit shown in FIG. 1;
  • FIG. 4 is a flowchart showing the processing procedure of the MI processing between the cache control unit and the secondary cache unit;
  • FIG. 5 is a functional block diagram showing a configuration of a CPU according to the second embodiment;
  • FIG. 6 is an explanatory diagram for explaining the operation of a cache control unit according to the second embodiment;
  • FIG. 7 is a flowchart showing the processing procedure of the cache control unit according to the second embodiment;
  • FIG. 8 is a flowchart showing the processing procedure of the MOR processing; and
  • FIG. 9 is an explanatory diagram for explaining a TSO violation in a multiprocessor and the principle of monitoring it.
  • FIG. 1 is a functional block diagram showing a configuration of a CPU according to the first embodiment.
  • the CPU 10 has processor cores 100 and 200 and a secondary cache unit 300; the secondary cache unit 300 is shared by the processor cores 100 and 200.
  • although the CPU 10 is described here as having two processor cores, it may have only one processor core or more processor cores.
  • since the two processor cores have the same configuration, the processor core 100 will be described here as an example.
  • the processor core 100 has an instruction unit 110, an operation unit 120, a primary instruction cache unit 130, and a primary data cache unit 140.
  • the instruction unit 110 is a processing unit that decodes and executes instructions. An MT (Multi Thread) control unit controls two threads, "thread 0" and "thread 1", and executes them simultaneously.
  • the arithmetic unit 120 is a processing unit that includes a general-purpose register, a floating-point register, a fixed-point arithmetic unit, and a floating-point arithmetic unit, and executes fixed-point arithmetic and floating-point arithmetic.
  • the primary instruction cache unit 130 and the primary data cache unit 140 are storage units that store a part of the main storage device in order to access the instructions and data stored in the main storage device at high speed.
  • the secondary cache unit 300 stores more of the instructions and data of the main storage device to compensate for the insufficient capacity of the primary instruction cache unit 130 and the primary data cache unit 140, and is connected to the main storage device via the system controller.
  • the primary data cache unit 140 has a cache memory 141 and a cache control unit 142; the cache memory 141 is a storage unit that stores data.
  • the cache control unit 142 is a processing unit that manages the data stored in the cache memory 141, and has a TLB (Translation Look-aside Buffer) 143, a TAG unit 144, a TAG-MATCH detection unit 145, an MIB (Move In Buffer) 146, an MO/BI processing unit 147, and a fetch port 148.
  • the TLB 143 is a processing unit that performs high-speed address translation from a virtual address (VA: Virtual Address) to a physical address (PA: Physical Address); it translates the virtual address received from the instruction unit 110 and outputs the physical address to the TAG-MATCH detection unit 145.
  • the TAG unit 144 is a processing unit that manages the cache lines registered in the cache memory 141, and outputs the physical address and thread identifier (ID) of the cache line registered at the location of the cache memory 141 corresponding to the virtual address received from the instruction unit 110 to the TAG-MATCH detection unit 145.
  • the thread identifier is an identifier for identifying whether the cache line is used in “thread 0” or “thread 1”.
  • FIG. 2 is a diagram showing an example of a cache tag, which is the information the TAG unit 144 uses to manage a cache line registered in the cache memory 141.
  • the cache tag includes a V bit indicating whether the cache line is valid (Valid), an S bit and an E bit indicating whether the cache line is of the shared type or the exclusive type, an ID indicating the thread using the cache line, and a PA indicating the physical address of the cache line. If the cache line is of the shared type, it may be held by another processor at the same time; if it is of the exclusive type, it is not held by another processor at the same time.
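as an illustration, the cache tag described above can be modeled as a small record. The field names follow FIG. 2 (V, S, E, thread ID, PA); the encoding is illustrative, not the patent's bit layout.

```python
from dataclasses import dataclass

# Sketch of the cache tag of FIG. 2 (illustrative encoding).
@dataclass
class CacheTag:
    v: bool         # valid bit
    s: bool         # shared type: may also be held by another processor
    e: bool         # exclusive type: not held by another processor
    thread_id: int  # thread using the line ("thread 0" or "thread 1")
    pa: int         # physical address of the cache line

    def is_shared(self):
        return self.v and self.s

    def is_exclusive(self):
        return self.v and self.e
```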
  • the TAG-MATCH detection unit 145 is a processing unit that compares the physical address received from the TLB 143 and the thread identifier received from the instruction unit 110 with the physical address and the thread identifier received from the TAG unit 144.
  • the TAG-MATCH detection unit 145 uses the cache line registered in the cache memory 141 when the physical address and the thread identifier match and the V bit is set; in other cases, it specifies the physical address and the thread identifier to the MIB 146 and instructs it to fetch the cache line requested by the instruction unit 110 from the secondary cache unit 300.
  • by comparing not only the physical address received from the TLB 143 with the physical address received from the TAG unit 144 but also the thread identifier received from the instruction unit 110 with the thread identifier received from the TAG unit 144, the TAG-MATCH detection unit 145 can determine not only whether the cache line is in the cache memory 141 but also whether the thread that requested the cache line and the thread that registered it in the cache memory 141 are the same, and different processing is performed based on the determination result.
  • the MIB 146 is a processing unit that issues a cache line fetch request (MI request) to the secondary cache unit 300 by designating a physical address. The cache tag of the TAG unit 144 and the contents of the cache memory 141 are updated in accordance with the cache line fetched by the MIB 146.
  • the MO / BI processing unit 147 is a processing unit that invalidates or discharges a specific cache line in the cache memory 141 based on a request from the secondary cache widget 300.
  • the MO/BI processing unit 147 can set the RIM flag of the fetch port 148 by invalidating or flushing a specific cache line, so that the TSO guarantee mechanism between processors can be used as a TSO guarantee mechanism between threads.
  • the fetch port 148 is a storage unit that stores an access destination address, a PSTV flag, a RIM flag, a RIF flag, and the like in response to each access request from the instruction unit 110.
  • FIG. 3 is a flowchart showing the processing procedure of the cache control unit 142 shown in FIG. 1.
  • first, the TLB 143 translates the virtual address into a physical address,
  • and the TAG unit 144 obtains the physical address, thread identifier, and V bit of the registered cache line from the virtual address using the cache tag.
  • the TAG-MATCH detection unit 145 compares the physical address input from the TLB 143 with the physical address input from the TAG unit 144, and checks whether the cache line requested by the instruction unit 110 is in the cache memory 141 (step S302). If the two physical addresses match, the thread identifier input from the instruction unit 110 is compared with the thread identifier input from the TAG unit 144, and it is checked whether the cache line in the cache memory 141 is used by the same thread (step S303).
  • if both thread identifiers match, it is further checked whether the V bit is set (step S304). If the V bit is set, the cache line requested by the instruction unit 110 is in the cache memory 141 for the same thread and is valid, so the cache control unit 142 uses the data in the data section (step S305).
  • otherwise, the cache line is fetched from the secondary cache unit 300 (step S306), and the cache control unit 142 uses the data of the fetched cache line (step S307).
  • as described above, the TAG-MATCH detection unit 145 checks whether not only the physical address but also the thread identifier matches, so that the cache control unit 142 can control cache lines between threads.
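the decision sequence of FIG. 3 (steps S302 through S307) can be condensed into one function. This is a behavioral sketch: `Tag` mirrors the cache tag fields used here, and `move_in` stands for the MI request issued via the MIB to the secondary cache unit; all names are illustrative.

```python
from collections import namedtuple

# Minimal view of the cache tag fields used in FIG. 3.
Tag = namedtuple("Tag", "pa thread_id v")

def lookup(tag, req_pa, req_thread, move_in):
    if tag is not None and tag.pa == req_pa:   # S302: address match?
        if tag.thread_id == req_thread:        # S303: same thread?
            if tag.v:                          # S304: V bit set?
                return "use cached data"       # S305
    # Miss, other-thread hit, or invalid line: fetch the line from the
    # secondary cache unit (S306) and use the fetched data (S307).
    move_in(req_pa, req_thread)
    return "use fetched data"
```

note that a hit by the other thread deliberately falls through to the move-in path, which is what lets the thread-identifier comparison drive the inter-thread control described above.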
  • FIG. 4 is a flowchart showing a processing procedure of the MI processing between the cache control unit 142 and the secondary cache unit 300.
  • this MI processing corresponds to step S306 of the cache control unit 142 shown in FIG. 3 and the processing performed by the secondary cache unit 300 in response to it.
  • the cache control unit 142 of the primary data cache unit 140 issues an MI request to the secondary cache unit 300 (step S401). The secondary cache unit 300 then checks whether the cache line for which the MI request was received is registered in the primary data cache unit 140 by another thread (step S402); if it is registered by another thread, the secondary cache unit 300 issues an MO/BI request to the primary data cache unit 140 in order to set the RIM flag (step S403).
  • whether the cache line for which the MI request was received is registered in the primary data cache unit 140 by another thread is determined using synonym control.
  • the synonym control is control in which the secondary cache unit manages the addresses registered in the primary cache unit and prevents multiple cache lines with the same physical address from being registered in the primary cache unit.
  • after the MO/BI processing unit 147 of the cache control unit 142 executes the MO/BI processing and sets the RIM flag (step S404), the secondary cache unit 300 sends out the cache line (step S405), and the cache control unit 142 receives the cache line and registers it together with the thread identifier (step S406). When the cache line arrives, the RIF flag is set.
  • if the cache line for which the MI request was received is not registered in the primary data cache unit 140 by another thread, the secondary cache unit 300 sends out the cache line without making an MO/BI request (step S405).
  • as described above, the secondary cache unit 300 uses synonym control to determine whether the cache line for which the MI request was received is registered in the primary data cache unit 140 by another thread. If it is registered by another thread, the MO/BI processing unit 147 of the cache control unit 142 executes the MO/BI processing and sets the RIM flag, so that the TSO guarantee mechanism between processors can be used as a TSO guarantee mechanism between threads.
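the MI processing of FIG. 4 (steps S401 through S406) can be sketched as follows. The synonym-control bookkeeping is modeled as a simple address-to-thread map on the secondary cache side; the class and method names are illustrative assumptions, not the patent's interfaces.

```python
class SecondaryCache:
    # Sketch of the secondary cache unit's side of the MI processing.
    # registrations maps physical address -> thread id registered in
    # the primary data cache (the synonym-control bookkeeping).
    def __init__(self):
        self.registrations = {}

    def mi_request(self, pa, thread_id, primary):
        owner = self.registrations.get(pa)
        if owner is not None and owner != thread_id:
            primary.mo_bi(pa)               # S403-S404: MO/BI sets RIM
        self.registrations[pa] = thread_id  # synonym-control update
        return ("line", pa)                 # S405: send out the line

class PrimaryCacheControl:
    def __init__(self):
        self.rim = False
        self.lines = {}

    def mo_bi(self, pa):
        # Invalidate or flush the line and set the RIM flag.
        self.lines.pop(pa, None)
        self.rim = True

    def fetch(self, pa, thread_id, secondary):
        line = secondary.mi_request(pa, thread_id, self)  # S401
        self.lines[pa] = thread_id  # S406: register with thread id
        return line
```

fetching the same physical address first from thread 0 and then from thread 1 triggers the MO/BI path only on the second request, as in FIG. 4.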
  • as described above, in the first embodiment, when a cache line with the same physical address is registered in the cache memory 141 but with a different thread identifier, the TAG-MATCH detection unit 145 of the primary data cache unit 140 makes an MI request to the secondary cache unit 300; when the cache line for which the MI request was received is registered in the primary data cache unit 140 by another thread, the secondary cache unit 300 requests the cache control unit 142 to execute the MO/BI processing; and the cache control unit 142 executes the MO/BI processing and sets the RIM flag of the fetch port 148. Therefore, TSO between threads can be guaranteed using the TSO guarantee mechanism between processors.
  • in the first embodiment, the secondary cache unit 300 issues the MO/BI request to the primary data cache unit 140 using synonym control; however, because synonym control increases the burden on the secondary cache unit, some secondary cache units do not have it. In such a case, if a cache line with the same physical address is registered in the cache memory with a different thread identifier, the primary data cache unit performs the MO/BI processing by itself, and TSO can be guaranteed.
  • specifically, the primary cache unit sends a flush request for the designated cache line to the secondary cache unit; the secondary cache unit that receives the request forwards it to the main storage controller, and the cache line is flushed to the main storage device according to the instruction of the main storage controller. Therefore, the cache line can be flushed from the primary data cache unit to the secondary cache unit by using this cache line flush operation.
  • in the first embodiment, the case where the RIM flag of the fetch port is set using the synonym control of the secondary cache unit or the cache line flush request of the primary data cache unit has been described.
  • in the second embodiment, a case will be described in which the secondary cache unit has no synonym control mechanism and the primary data cache unit has no mechanism for issuing a cache line flush request. In this case, TSO is guaranteed by flushing and invalidating the replacement block generated when a cache line is replaced, and by monitoring access requests to the cache memory and the main storage device.
  • since the second embodiment differs from the first mainly in the operation of the cache control unit of the primary data cache unit, the operation of the cache control unit will be described.
  • FIG. 5 is a functional block diagram showing a configuration of the CPU according to the second embodiment.
  • the CPU 500 has four processor cores 510 to 540 and a secondary cache unit 550 shared by the four processor cores. Since the four processor cores 510 to 540 all have the same configuration, the processor core 510 will be described here as an example.
  • the processor core 510 includes an instruction unit 511, an operation unit 512, a primary instruction cache unit 513, and a primary data cache unit 514.
  • the instruction unit 511 is a processing unit that decodes and executes instructions in the same manner as the instruction unit 110; an MT (Multi Thread) control unit controls two threads, "thread 0" and "thread 1", and executes them simultaneously.
  • the arithmetic unit 512 is a processing unit that executes fixed-point arithmetic and floating-point arithmetic in the same manner as the arithmetic unit 120.
  • the primary instruction cache unit 513 is, like the primary instruction cache unit 130, a storage unit that stores a part of the main storage device in order to access the instructions stored in the main storage device at high speed.
  • the primary data cache unit 514 is, like the primary data cache unit 140, a storage unit that stores a part of the main storage device in order to access the data stored in the main storage device at high speed.
  • however, when a cache line whose physical address matches but whose thread identifier differs is registered in the cache memory, the cache control unit 515 does not issue an MI request from the MIB to the secondary cache unit. Instead, the cache control unit 515 performs a replace move-out (MOR) process on the cache line whose physical address matches, and changes the thread identifier registered in the cache tag.
  • during the MOR process, the fetch ports are monitored, and if there is a matching address, the RIM flag and the RIF flag are set.
  • the RIF flag can also be set when a different thread writes to the cache memory or the main storage device. TSO is then guaranteed by requesting re-execution of the instruction when a fetch port in which both the RIM flag and the RIF flag are set returns STV.
  • FIG. 6 is an explanatory diagram for explaining the operation of the cache control unit 515.
  • The figure shows the classification of cache access operations according to the instruction that attempted to use the cache line and the state of the cache line.
  • The cache access operations of the cache control unit 515 comprise ten access patterns and three classes of operation.
  • The first of the three classes is the operation in the case of a cache miss (patterns 1 and 6).
  • In this case, an MI request for the cache line is issued to the secondary cache unit and the cache line is fetched. When the cache line is needed for a data load, the fetched cache line is registered as the shared type; when the cache line is needed for a data store, the fetched cache line is registered as the exclusive type.
  • The second of the three classes is the normal cache-hit operation when multithread operation is not in effect (patterns 2, 3, 4, and 8); in this case the unit operates in the same way as for an ordinary cache hit, without any special processing, and the state of the cache line does not change.
  • The third of the three classes covers the operations that occur to guarantee TSO between threads during multithread operation (patterns 5, 7, 9, and 10); in these cases the RIM flag and the RIF flag are set.
  • FIG. 7 is a flowchart showing a processing procedure of the cache control unit 515.
  • First, the cache control unit 515 checks whether the access requested by the instruction unit 511 is a load or a store (step S701).
  • If the access is a load (Yes at step S701), it is checked whether a cache miss has occurred (step S702). If it is a cache miss, the MIB is secured (step S703) and a cache line is requested from the secondary cache unit 550 (step S704). When the cache line arrives, it is registered as the shared type (step S705), and the data in the data section is used (step S706).
  • On the other hand, if it is a cache hit, it is checked whether the hit cache line was registered by the same thread (step S707). If it was registered by the same thread, the data in the data section is used (step S706). If the hit cache line was not registered by the same thread, it is checked whether the cache line is the shared type (step S708). If it is the shared type, the data is used (step S706); if it is the exclusive type, MOR processing is executed to set the RIM flag and the RIF flag (step S709), and then the data in the data section is used (step S706).
  • If the access is a store (No at step S701), it is checked whether a cache miss has occurred (step S710). If it is a cache miss, the MIB is secured (step S711) and a cache line is requested from the secondary cache unit 550 (step S712). When the cache line arrives, it is registered as the exclusive type (step S713), and the data is stored in the data section (step S714).
  • On the other hand, if it is a cache hit, it is checked whether the hit cache line was registered by the same thread (step S715). If it was registered by the same thread, it is checked whether the cache line is the shared type or the exclusive type (step S716). If it is the exclusive type, the data is stored in the data section (step S714). If it is the shared type, MOR processing is executed to set the RIM flag and the RIF flag (step S717), the cache line in the other processor cores is invalidated (step S718), the cache line is changed to the exclusive type (step S719), and the data is stored in the data section (step S714).
  • If the hit cache line was not registered by the same thread, MOR processing is executed to set the RIM flag and the RIF flag (step S720), and it is checked whether the cache line is the shared type or the exclusive type (step S716). If it is the exclusive type, the data is stored in the data section (step S714). If it is the shared type, the cache line in the other processor cores is invalidated (step S718), the cache line is changed to the exclusive type (step S719), and the data is stored in the data section (step S714).
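The load and store procedures of FIG. 7 can be sketched as the following illustrative model. All names here (`cache_access`, the dict-based tag, the action strings) are assumptions for exposition rather than the patented implementation; the step numbers in the comments refer to the flowchart.

```python
def cache_access(line, op, thread):
    """Sketch of the load/store decision flow (steps S701-S720).

    `line` is None on a cache miss, otherwise a dict with the keys
    'thread' and 'shared' (a hypothetical representation of the tag).
    Returns the ordered list of actions taken plus the updated line.
    """
    actions = []
    if line is None:                               # cache miss (S702 / S710)
        actions += ["secure_MIB", "request_line"]  # S703-S704 / S711-S712
        # a load registers the fetched line as shared, a store as exclusive
        line = {"thread": thread, "shared": op == "load"}   # S705 / S713
    elif op == "load":
        if line["thread"] != thread and not line["shared"]:
            # exclusive line registered by another thread (S709)
            actions.append("MOR_set_RIM_RIF")
            line["thread"] = thread
    else:  # store hit
        if line["thread"] != thread:
            actions.append("MOR_set_RIM_RIF")      # S720
            line["thread"] = thread
        elif line["shared"]:
            actions.append("MOR_set_RIM_RIF")      # S717
        if line["shared"]:
            actions += ["invalidate_other_cores", "make_exclusive"]  # S718-S719
            line["shared"] = False
    actions.append("use_data" if op == "load" else "store_data")     # S706 / S714
    return actions, line
```

Note that a load hitting a shared line of another thread needs no MOR: shared lines can be read by either thread, so only exclusive ownership or a store triggers the TSO machinery.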
  • FIG. 8 is a flowchart showing a processing procedure of the MOR processing.
  • In the MOR processing, a MIB is secured (step S801) and a replace move-out operation is started.
  • Then, half of the cache line is read out to the replace move-out buffer (step S802), and it is checked whether the replace move-out is prohibited (step S803).
  • A case where the replace move-out is prohibited is, for example, a case where a special instruction such as compare-and-swap is using the cache line.
  • If the replace move-out is prohibited, the data in the replace move-out buffer is not used; the process returns to step S802, and the cache line is read out to the replace move-out buffer again. If it is not prohibited, the replace move-out is carried out using the data in the replace move-out buffer (step S804).
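The MOR procedure of FIG. 8 can be sketched as a retry loop. The callback names and the `max_tries` bound are illustrative assumptions; real hardware would retry in the cache pipeline rather than in software, and the bound exists only to keep the sketch finite.

```python
def mor_process(read_half_line, moveout_prohibited, max_tries=1000):
    """Sketch of the MOR procedure (steps S801-S804).

    read_half_line() -> bytes     : reads half of the cache line into the
                                    replace move-out buffer (S802)
    moveout_prohibited() -> bool  : True while, e.g., a compare-and-swap
                                    is using the cache line (S803)
    Both callbacks are hypothetical stand-ins for hardware signals.
    """
    # S801: secure a MIB and start the replace move-out operation
    for _ in range(max_tries):
        buffer = read_half_line()          # S802
        if moveout_prohibited():           # S803
            continue                       # discard the buffer, retry S802
        return buffer                      # S804: the move-out proceeds
    raise RuntimeError("replace move-out kept being prohibited")
```

The key property is that a prohibited attempt discards the buffered data and re-reads the line, so the move-out never ships data that a special instruction was still operating on.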
  • Since the MOR processing performs the replace move-out operation, the TSO guarantee mechanism between processor cores is activated, and the flags are set for fetch ports that use the same cache line as the replace move-out.
  • In this way, the TSO guarantee mechanism between processor cores can also function as a TSO guarantee mechanism between threads.
  • However, different threads may compete for the same cache line on the same processor core. Such competition is handled by the same mechanism that operates when different processors compete for the same cache line in a multiprocessor environment, which is described next.
  • In a multiprocessor environment, each processor has a cache-line ejection prohibition control and a control that forcibly disables it. That is, a processor holding a cache line tries to hold off ejecting the cache line until its store is completed; this is the cache-line ejection prohibition control.
  • On the other hand, when the cache-pipeline processing of a cache-line ejection request received from another processor has failed a certain number of times, the store to that cache line is forcibly stopped and the cache line is ejected once, so that the cache line is passed to the other processor. After that, if the store to the cache line is to be continued, an ejection request for the cache line is sent to the other processor; as a result, the cache line eventually arrives again and the store can be continued.
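The ejection prohibition control and its forced override described above can be sketched as a simple decision rule. The function name, the action strings, and the retry limit are illustrative assumptions; the patent does not specify the threshold value.

```python
def process_flush_request(store_in_progress, failed_attempts, limit=8):
    """Sketch of the anti-hang rule for cache-line ejection requests.

    A processor holding the line normally rejects ejection requests
    until its store completes (ejection prohibition control). Once a
    request has failed `limit` times in the cache pipeline, the store
    is forcibly stopped and the line is ejected once, handing it to
    the requesting processor.
    """
    if store_in_progress and failed_attempts < limit:
        return "reject"                 # ejection still prohibited
    return "stop_store_and_eject"       # hand the line to the requester
```

Because each side is guaranteed to eventually win the line for at least one pass, neither processor (nor, as the text notes, either thread using the replace move-out path) can hang indefinitely.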
  • This mechanism, which operates when different processors contend for the same cache line in a multiprocessor environment, also operates during the replace move-out used to pass a cache line between threads. Therefore, in either case, the cache line is successfully transferred between the threads and a hang is prevented.
  • As described above, the cache control unit 515 of the primary data cache unit 514 monitors accesses to the cache memory and the main storage device, and when a TSO violation may occur, MOR processing is executed to set the RIM flag and the RIF flag; thus the TSO guarantee mechanism between processor cores can work as a TSO guarantee mechanism between threads.
  • In the present embodiment, the case where a shared-type cache line is shared between different threads has been described; however, the present invention is not limited to this, and the same approach can be applied to control in which a shared-type cache line, like an exclusive-type cache line, is held exclusively between threads.
  • In that case, the TSO guarantee mechanism between processor cores can work as a TSO guarantee mechanism between threads by executing MOR processing when a load hits a cache line registered by another thread.
  • Further, the present invention is not limited to the two-thread case; the same approach can be applied to the case where three or more threads are processed at one time by the instruction unit.
  • The simultaneous multithreading method is a method in which a plurality of threads are processed at one time.
  • Among multithreading methods there is also a time-division multithreading method, in which only one thread is processed at a time and threads are switched at regular intervals or when instruction execution is found to be delayed by a cache miss or the like. TSO guarantee in the case of the time-division multithreading method is therefore described next.
  • In the time-division multithreading method, the running thread is put to sleep and another thread is started, thereby switching threads. Therefore, when threads are switched, all fetch instructions and store instructions that were issued by the thread being put to sleep but have not been committed are canceled. This avoids the TSO violations that could otherwise arise from stores by another thread due to out-of-order completion of fetch instructions.
  • A committed store instruction waits for execution in the store port or the write buffer, which holds the store request and the store data, until the write to the cache memory or the main storage device becomes possible.
  • Whether the result of a preceding store must be reflected in a subsequent fetch, that is, whether the subsequent fetch uses the memory area operated on by the preceding store, is detected by comparing the address and operand length of the store request with the address and operand length of the fetch request. In such a case, execution of the fetch is made to wait, by SFI (Store Fetch Interlock), until execution of the store is completed.
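The overlap detection described above, comparing the store and fetch request addresses and operand lengths, amounts to a half-open interval intersection test. A minimal sketch (the function name and byte-granular addressing are assumptions for illustration):

```python
def sfi_must_wait(store_addr, store_len, fetch_addr, fetch_len):
    """Return True when a subsequent fetch uses the memory area operated
    on by a preceding store, i.e. when the byte ranges
    [store_addr, store_addr + store_len) and
    [fetch_addr, fetch_addr + fetch_len) intersect.
    This is the condition under which SFI holds the fetch until the
    store completes.
    """
    return (store_addr < fetch_addr + fetch_len
            and fetch_addr < store_addr + store_len)
```

For example, an 8-byte store at 0x100 overlaps a 4-byte fetch at 0x104, so the fetch must wait, while a 4-byte store at 0x100 and a 4-byte fetch at 0x104 are disjoint and the fetch may proceed.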
  • Even while a thread is dormant, the SFI operation is kept enabled so that the effect of stores from the different thread is reflected. This avoids TSO violations caused by stores of a different thread while a thread is dormant.
  • As described above, according to the present invention, it is determined whether the thread that registered the stored data and the thread that issued the memory access request are the same, and, based on the determination result, the consistency guarantee mechanism is operated, which guarantees the consistency of the read and write execution order among a plurality of instruction processing devices for data shared among them; this has the effect that the consistency of the execution order of reads and writes of shared data between threads can be guaranteed. Further, according to the present invention, when a cache line including the data at the address specified by a memory access request is stored, it is determined whether the thread that registered the stored cache line and the thread that issued the memory access request are the same, and the consistency guarantee mechanism is operated based on the determination result.
  • Further, according to the present invention, when a cache line whose physical address matches a memory access request from the instruction processing device has been registered by a different thread, the primary data cache device issues a cache line fetch request to the secondary cache device; when the cache line for which the fetch request was received has been registered in a primary data cache device by another thread, the secondary cache device requests that primary data cache device to invalidate or flush the cache line, and the primary data cache device invalidates or flushes the cache line, which is shared with the primary data cache devices of the other sets, based on the request from the secondary cache device.
  • As described above, the storage control device, data cache control device, central processing unit, storage device control method, data cache control method, and cache control method according to the present invention are suitable for a multithreaded computer system that executes a plurality of threads simultaneously.


Abstract

The invention relates to a central processing unit comprising a plurality of sets each consisting of an instruction processing device that simultaneously processes a plurality of threads and a primary data cache device, together with a secondary data cache device shared by the primary data cache devices of the plurality of sets. The central processing unit comprises a primary data cache unit and a secondary cache unit. Even if a cache line whose physical address matches is registered in the cache memory, when the thread identifiers differ, the primary data cache unit issues an MI request to the secondary cache unit, performs an MO/MI in accordance with a request from the secondary cache unit, and sets the RIM flag of the fetch port. If the cache line for which the MI request was received is registered in the primary data cache unit by another thread, the secondary cache unit requests the primary cache unit to perform an MO/BI.
PCT/JP2003/000723 2003-01-27 2003-01-27 Dispositif de commande de memorisation, dispositif de commande de cache de donnees, unite centrale, procede de commande de dispositif de memorisation, procede de commande de cache de donnees et procede de commande de cache WO2004068361A1 (fr)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2004567505A JP4180569B2 (ja) 2003-01-27 2003-01-27 記憶制御装置、データキャッシュ制御装置、中央処理装置、記憶装置制御方法、データキャッシュ制御方法およびキャッシュ制御方法
PCT/JP2003/000723 WO2004068361A1 (fr) 2003-01-27 2003-01-27 Dispositif de commande de memorisation, dispositif de commande de cache de donnees, unite centrale, procede de commande de dispositif de memorisation, procede de commande de cache de donnees et procede de commande de cache
US11/123,140 US20050210204A1 (en) 2003-01-27 2005-05-06 Memory control device, data cache control device, central processing device, storage device control method, data cache control method, and cache control method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2003/000723 WO2004068361A1 (fr) 2003-01-27 2003-01-27 Dispositif de commande de memorisation, dispositif de commande de cache de donnees, unite centrale, procede de commande de dispositif de memorisation, procede de commande de cache de donnees et procede de commande de cache

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US11/123,140 Continuation US20050210204A1 (en) 2003-01-27 2005-05-06 Memory control device, data cache control device, central processing device, storage device control method, data cache control method, and cache control method

Publications (1)

Publication Number Publication Date
WO2004068361A1 true WO2004068361A1 (fr) 2004-08-12

Family

ID=32800789

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2003/000723 WO2004068361A1 (fr) 2003-01-27 2003-01-27 Dispositif de commande de memorisation, dispositif de commande de cache de donnees, unite centrale, procede de commande de dispositif de memorisation, procede de commande de cache de donnees et procede de commande de cache

Country Status (2)

Country Link
JP (1) JP4180569B2 (fr)
WO (1) WO2004068361A1 (fr)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5088754B2 (ja) 2009-12-18 2012-12-05 インターナショナル・ビジネス・マシーンズ・コーポレーション システム、方法、プログラムおよびコード生成装置

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5257354A (en) * 1991-01-16 1993-10-26 International Business Machines Corporation System for monitoring and undoing execution of instructions beyond a serialization point upon occurrence of in-correct results
US5265233A (en) * 1991-05-17 1993-11-23 Sun Microsystems, Inc. Method and apparatus for providing total and partial store ordering for a memory in multi-processor system
US5699538A (en) * 1994-12-09 1997-12-16 International Business Machines Corporation Efficient firm consistency support mechanisms in an out-of-order execution superscaler multiprocessor
US6122712A (en) * 1996-10-11 2000-09-19 Nec Corporation Cache coherency controller of cache memory for maintaining data anti-dependence when threads are executed in parallel
WO2001025903A1 (fr) * 1999-10-01 2001-04-12 Sun Microsystems, Inc. Procede de routine de deroutement precise en cas de charges speculatives et defectueuses
US20030014602A1 (en) * 2001-07-12 2003-01-16 Nec Corporation Cache memory control method and multi-processor system


Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8244985B2 (en) 2004-03-30 2012-08-14 Intel Corporation Store performance in strongly ordered microprocessor architecture
JP2008544417A (ja) * 2005-06-29 2008-12-04 インテル コーポレイション キャッシュする方法、装置及びシステム
KR101077514B1 (ko) 2007-06-19 2011-10-28 후지쯔 가부시끼가이샤 캐시 제어장치 및 제어방법
WO2008155822A1 (fr) * 2007-06-19 2008-12-24 Fujitsu Limited Contrôleur de mémoire cache et procédé de commande
US8412886B2 (en) 2007-06-19 2013-04-02 Fujitsu Limited Cache controller and control method for controlling access requests to a cache shared by plural threads that are simultaneously executed
JPWO2008155822A1 (ja) * 2007-06-19 2010-08-26 富士通株式会社 キャッシュ制御装置及び制御方法
JP4706030B2 (ja) * 2007-06-19 2011-06-22 富士通株式会社 キャッシュ制御装置及び制御方法
US8103859B2 (en) 2007-06-20 2012-01-24 Fujitsu Limited Information processing apparatus, cache memory controlling apparatus, and memory access order assuring method
WO2008155829A1 (fr) * 2007-06-20 2008-12-24 Fujitsu Limited Processeur d'informations, contrôleur de mémoire cache et procédé d'assurance de séquence d'accès à une mémoire
JP4710024B2 (ja) * 2007-06-20 2011-06-29 富士通株式会社 キャッシュメモリ制御装置およびキャッシュメモリ制御方法
JPWO2008155829A1 (ja) * 2007-06-20 2010-08-26 富士通株式会社 情報処理装置,キャッシュメモリ制御装置およびメモリアクセス順序保証方法
JP4983919B2 (ja) * 2007-06-20 2012-07-25 富士通株式会社 演算処理装置および演算処理装置の制御方法
US8261021B2 (en) 2007-06-20 2012-09-04 Fujitsu Limited Cache control device and control method
JP2009288977A (ja) * 2008-05-28 2009-12-10 Fujitsu Ltd キャッシュメモリ制御装置、半導体集積回路、およびキャッシュメモリ制御方法
JP2011134205A (ja) * 2009-12-25 2011-07-07 Fujitsu Ltd 情報処理装置およびキャッシュメモリ制御装置
US9390012B2 (en) 2010-06-14 2016-07-12 Fujitsu Limited Multi-core processor system, cache coherency control method, and computer product
WO2013084314A1 (fr) * 2011-12-07 2013-06-13 富士通株式会社 Unité de traitement et procédé de commande d'unité de traitement
JP2014006807A (ja) * 2012-06-26 2014-01-16 Fujitsu Ltd 演算処理装置、キャッシュメモリ制御装置及びキャッシュメモリの制御方法
JP2014112383A (ja) * 2013-12-19 2014-06-19 Intel Corp セキュアなアプリケーションの実行を提供するプロセッサ
JP7160514B2 (ja) 2015-08-12 2022-10-25 エヌエイチエヌ コーポレーション モバイル環境におけるリソースダウンロード方法、記録媒体、およびリソースダウンロードシステム
CN111652749A (zh) * 2018-11-28 2020-09-11 阿里巴巴集团控股有限公司 信息核查方法以及装置
CN111652749B (zh) * 2018-11-28 2024-04-16 创新先进技术有限公司 信息核查方法以及装置

Also Published As

Publication number Publication date
JP4180569B2 (ja) 2008-11-12
JPWO2004068361A1 (ja) 2006-05-25

Similar Documents

Publication Publication Date Title
JP2566701B2 (ja) 共有キャッシュ内のデータ・ユニットに対する所有権の変更制御装置
JP4208895B2 (ja) キャッシュメモリ装置および処理方法
US8301843B2 (en) Data cache block zero implementation
JP4982375B2 (ja) 複数のコアを介してのモニタリングされたキャッシュラインの共有
KR100228940B1 (ko) 메모리 일관성 유지 방법
JP4376692B2 (ja) 情報処理装置、プロセッサ、プロセッサの制御方法、情報処理装置の制御方法、キャッシュメモリ
WO2004068361A1 (fr) Dispositif de commande de memorisation, dispositif de commande de cache de donnees, unite centrale, procede de commande de dispositif de memorisation, procede de commande de cache de donnees et procede de commande de cache
US9547596B2 (en) Handling of a wait for event operation within a data processing apparatus
US20080082796A1 (en) Managing multiple threads in a single pipeline
JPH0785222B2 (ja) データ処理装置
JPH0239254A (ja) データ処理システム及びそのキヤツシユ記憶システム
US20060064518A1 (en) Method and system for managing cache injection in a multiprocessor system
JPH0668735B2 (ja) キヤツシユメモリ−
JPH0670779B2 (ja) フェッチ方法
US20090106498A1 (en) Coherent dram prefetcher
US9424190B2 (en) Data processing system operable in single and multi-thread modes and having multiple caches and method of operation
JP3862959B2 (ja) マイクロプロセッサのロード/ストア命令制御回路、およびロード/ストア命令制御方法
JP2022526057A (ja) メモリ順序付け違反チェックバッファの排出遅延を許容するための投機的命令ウェイクアップ
US20050210204A1 (en) Memory control device, data cache control device, central processing device, storage device control method, data cache control method, and cache control method
EP0374370B1 (fr) Procédé de mémorisation dans des lignes d'antémémoire non exclusives dans des systèmes multiprocesseurs
US20070112998A1 (en) Virtualized load buffers
CN101833517B (zh) 快取存储器系统及其存取方法
US6266767B1 (en) Apparatus and method for facilitating out-of-order execution of load instructions
US7975129B2 (en) Selective hardware lock disabling
JP3320562B2 (ja) キャッシュメモリを有する電子計算機

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): JP US

WWE Wipo information: entry into national phase

Ref document number: 2004567505

Country of ref document: JP

WWE Wipo information: entry into national phase

Ref document number: 11123140

Country of ref document: US
