
WO1999038085A1 - Method and apparatus for enforcing ordered execution of reads and writes across a memory interface - Google Patents

Method and apparatus for enforcing ordered execution of reads and writes across a memory interface

Info

Publication number
WO1999038085A1
Authority
WO
WIPO (PCT)
Prior art keywords
memory
requests
processor
interface
reordering
Prior art date
Application number
PCT/US1999/001387
Other languages
English (en)
Inventor
Robert F. Sproull
Original Assignee
Sun Microsystems, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Microsystems, Inc. filed Critical Sun Microsystems, Inc.
Priority to JP2000528921A priority Critical patent/JP2002510079A/ja
Priority to AU23361/99A priority patent/AU2336199A/en
Priority to EP99903307A priority patent/EP1047996A1/fr
Publication of WO1999038085A1 publication Critical patent/WO1999038085A1/fr

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30076Arrangements for executing specific machine instructions to perform miscellaneous control operations, e.g. NOP
    • G06F9/30087Synchronisation or serialisation instructions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14Handling requests for interconnection or transfer
    • G06F13/16Handling requests for interconnection or transfer for access to memory bus
    • G06F13/1605Handling requests for interconnection or transfer for access to memory bus based on arbitration
    • G06F13/161Handling requests for interconnection or transfer for access to memory bus based on arbitration with latency improvement
    • G06F13/1621Handling requests for interconnection or transfer for access to memory bus based on arbitration with latency improvement by maintaining request order
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3824Operand accessing
    • G06F9/3834Maintaining memory consistency

Definitions

  • the present invention relates to read/write interfaces between processors and memories. More generally, it relates to interfaces between clients of a memory mapped resource and that resource. In a particular embodiment, the invention provides a solution to the problem of efficiently using the interface while still ensuring that reads and writes are performed in proper sequence when a particular sequence is required.
  • Memory refers to a memory system, which may include data paths, controller chips, buffers, queues, and memory chips. While this disclosure describes the problems and solutions in data storage memory, it should be understood that the problems and solutions can be generalized in many cases to memory-mapped circuits which perform more than just storage of data (e.g., memory-mapped I/O, memory-mapped compute devices).
  • a “memory location” (or simply “a location”) is an individually addressable unit of the memory that holds data (or transports data to and/or from an I/O device or a compute device).
  • a "client” is a central processing unit (CPU) , processor, I/O controller or other device which uses the services provided by the memory system.
  • a "request” is an action performed by a client in using the services of a memory system.
  • a "read request” (or simply “a read”) is a request from a client to the memory requesting the contents of a memory location specified by an address of the memory location to be read; the read request is accompanied by the address of the read memory location.
  • a "write request” (or simply “a write”) is a request from a client to the memory requesting that the memory place a write value into a write memory location; the write request is accompanied by the write value and the address of the write memory location.
  • An “acknowledgment” (or simply “an ack”) is an indication returned by the memory to the client indicating that a request has been satisfied; an acknowledgment to a read request includes the data read from the specified memory location.
  • Pending reads is the set of read requests which are pending; a read request is “pending” from the time it is accepted by the memory until the memory issues an ack.
  • Pending writes, analogous to pending reads, is the set of write requests which are pending; a write request is “pending” from the time it is accepted by the memory until the memory issues an acknowledgment.
  • When building memory systems for large computers, one feature which provides for high performance is concurrency, wherein more than one memory operation is in progress at the same time.
  • One limitation on concurrency is that a CPU, or other client, requires memory consistency. A memory appears consistent when a "read" of a memory location returns the value most recently "written" to that location. In some systems with concurrency, reads and writes are reordered into an optimized execution order to achieve higher performance; however, this may lead to loss of consistency.
  • Consistency is easy to implement if memory requests are always processed in exactly the same order as they are issued by the client. Preserving the order exactly, however, is often not possible in high-performance memory designs which may need to reorder requests to speed up processing. For example, the system requirements might be such that read requests must be completed faster than write requests because pending read requests hold up processing until the read data is returned. However the reordering of requests is done, it must not violate the consistency that is inherent in the one-request-at-a-time memory model described above.
  • One set of reordering constraints is as follows:
  • Rule 1 A read of location X followed by a write of location X cannot be reordered relative to each other.
  • Rule A is often implemented by adding "store buffers" to the processor.
  • Rule B is almost never implemented because its performance advantage is very slight. Nonetheless, Rules A and B give some insight into what can be done at the processor to increase concurrency and thus improve performance while still maintaining consistency.
  • A MEMBAR instruction provides a way for a programmer to enforce an order on the reads and writes issued by a client.
  • MEMBAR instructions are interspersed in the instruction codes executed by a processor. When a processor is executing instructions and encounters a MEMBAR instruction, it holds up further read and write operations until the operations which preceded the MEMBAR instruction have completed.
  • Sproull-Sutherland discloses a method of determining whether the read and write operations have been completed (that patent/application is commonly assigned to the assignee of the present application and is incorporated herein by reference for all purposes).
  • The SUN SPARC-V9 manual explains how the ordering constraints are enforced by the processor. There, given a first operation and a second operation, if the second operation must not be performed before the first operation, the execution unit delays the submission of the second operation to the memory until the first operation is no longer pending.
  • the client maintains its record of pending reads and writes by noting (a) when it issues each new request and (b) when each request is eventually acknowledged, signifying that the request is no longer pending (a sketch of this bookkeeping appears after this description).
  • If the processor holds up an operation instead of using a bandwidth-limited interface whenever the interface is available, performance may be lost, as extra time would be needed to send the held-up request and the critical path involving that request would be lengthened.
  • Fenwick appears to show how barrier instructions operate in the context of the Alpha 21164 microprocessor, built by Digital Equipment Corporation.
  • a barrier instruction MB or "memory barrier”
  • the MB instruction is reported off-chip, and may be used at the interface between the microprocessor and the memory bus, but the MB instruction does not apparently pass over the memory bus.
  • the MB information is not needed beyond the bus.
  • a similar instruction is used in the memory interface of most microprocessors (for example, waiting for all pending memory transactions to complete before allowing any new memory requests to be issued), but the interface circuitry is commonly provided on the microprocessor chip itself.
  • What is needed is a processor-memory interface which allows the processor to enforce execution order of concurrently submitted operations, even when multiple operations required to be ordered are submitted to the memory, which may reorder operations for its own purposes.
  • a memory interface is provided between a processor and a memory which is capable of multiple concurrent transactions or accesses.
  • the interface between the processor and the memory carries read and write operations as well as "barrier" operations, where a barrier operation signals the non-reorderability of operations.
  • the barrier operations are used in connection with resolved regions and unresolved regions of a processor system's architecture.
  • the unresolved region is a region wherein operations may be reordered for efficiency or other reasons and the resolved region is a region wherein the operations are in a fixed order from which they cannot be reordered.
  • reordering constraints can survive the travel through the unresolved region so that any necessary reordering between the unresolved region and the resolved region can occur. Since the unresolved region extends into the memory, it is possible for the memory to perform optimization reordering or other reordering of operations. Once the operations reach the boundary between the unresolved region and the resolved region, the operations are reordered, as needed, to comply with the constraints of the barrier operations.
  • the memory interface is an interface to one or more memory mapped input/output (I/O) devices or computational devices.
  • memory operations are initiated by more than one processor.
  • While the exact location of the processor-memory boundary might not be clear, the present invention is useful wherever reordering dictated by the memory system is being performed, as opposed to reordering dictated only by the processor.
  • FIG. 1 is a block diagram of a processing system according to the present invention.
  • FIG. 2 is a block diagram of a multiple processor processing system according to the present invention.
  • FIG. 3 shows an example of a request stream as might be used in the present invention.
  • FIG. 4 shows a variation of the request stream of FIG. 3 as might be used in a dual-path request stream system.
  • FIG. 5 is a block diagram of a banked memory system according to the present invention.
  • a client specifying the sequence must be able to also specify when a portion of the sequence must be handled in the order specified and not reordered.
  • a processor might specify a sequence of memory requests. The memory requests are executed at a memory, after passing through processor logic and buffers, a processor-memory bus (which might be shared with more than one processor and/or more than one memory), and memory interface circuits interposed between the bus and the memory. Any of these intermediate elements might be adapted to reorder the sequence of memory requests.
  • a memory interface circuit includes a paging unit for loading and unloading pages of memory from a slow, large memory to a fast core memory, where all memory requests happen within the core memory.
  • the memory interface circuit might reorder memory requests so that all the requests to be done within one page of memory are done at once to reduce the number of page swaps required to fulfill all of the memory requests.
  • the order of the requests can be determined at any point in the path of these memory requests from a processor to a memory. However, if a system does reorder requests, there are some points in the path where the order of the memory requests is not necessarily resolved or resolvable. The collection of these points is referred to herein as the "unresolved region" of the path and an "unresolved" pathway is a pathway, such as a network or a bus, which carries requests in an unresolved order. When a processor must be able to specify a particular order of handling, at some point the unresolved region must end. The region beyond that point is referred to herein as the "resolved region" of the path.
  • operations are in a fixed order from which they will not be further reordered.
  • FIGS. 1-2 show processor systems in which an order enforcing system according to the present invention might be used.
  • FIG. 1 shows a processor system 100 comprising a processor 102, a memory subsystem 104, an I/O subsystem 106 and a compute subsystem 108. Each of these components is coupled via a communications link 110. Each of the three subsystems is memory-mapped, i.e., processor 102 interfaces with the subsystem using addressed read and write requests as if the subsystem were an addressable memory. Communications link 110 could be a memory bus, a network, or the like.
  • Memory subsystem 104 is shown comprising a memory interface circuit 120 and a memory array 122.
  • when processor 102 sends a read request or a write request over communications link 110 to memory subsystem 104, it is received and handled by memory interface circuit 120, and interface circuit 120 handles the storage and retrieval of data in the specified locations of memory array 122.
  • I/O subsystem 106 is shown comprising a memory-mapped I/O interface circuit 130 and the I/O devices are shown generally as 132.
  • Interface circuit 130 receives and handles requests from processor 102 in much the same way as interface circuit 120 handles memory requests.
  • Memory-mapped I/O is not the focus of this description and is well known, so many details are omitted here for brevity.
  • Compute subsystem 108 is shown comprising a memory-mapped compute interface circuit 140 and a compute device 142.
  • interface circuit 140 receives and handles requests from processor 102 over communications link 110.
  • Interface circuit 140 converts requests, which are formatted as requests to a particular memory location, into messages to and from compute device 142 according to a predetermined memory map and convention.
  • compute device 142 might be a floating point processor and, by convention, a read request from a particular memory address might be a request for the results of a floating point operation, while a write request to a particular memory address might supply an operand used in a floating point operation.
  • the reordering enforcement system according to the present invention is not limited to bus-based systems.
  • the details of I/O subsystem 106 and compute subsystem 108 are not as important as the understanding that ordering of requests can be important in these subsystems. For example, if processor 102 sends a write request to I/O subsystem 106 to configure an external I/O device (such as initializing a serial communications circuit) and then sends a read request to gather data from that I/O device, those requests should appear at the I/O device in the order required by processor 102.
  • if either interface circuit 140 or communications link 110 reorders the requests for its own internal efficiency, it should return those requests to the relative order in which processor 102 sent them.
  • the request stream shows read requests, write requests and barrier requests.
  • a barrier is sent to the memory subsystem to signal that the subsystem should not reorder requests across the barrier, i.e., that all requests received prior to the barrier must be handled before any request received after the barrier.
  • the barrier requests are indicated by the label "MEMIBAR" which is short for "memory interface barrier.” MEMIBAR requests should not be confused with MEMBAR instructions, which are instructions inserted into a program to control the operation of a processor. By contrast, the MEMIBAR requests are sent from the processor to the memory subsystem to enforce ordering.
  • requests are shown being sent to a memory subsystem, in order, from request 1 to request 14.
  • read requests include an address
  • write requests include an address and the data to be written (as the actual data is not relevant here, it is shown in FIG. 3(a) as "xx").
  • requests 5 and 11 are barriers, and therefore the memory subsystem is free to reorder requests 1-4 among themselves, 6-10 among themselves and 12-14 among themselves.
  • the memory subsystem might otherwise group requests dealing with one page (requests 1, 4 and 6) to perform them before a page swap and group the remainder of the requests to perform them after the page swap.
  • request 5 is a barrier
  • request 6 cannot be reordered for execution before the page swap because that would require it to be executed before requests 2-3.
  • the barrier at request 5 ensures that request 6 (a write to address 309F) does not get reordered relative to request 4 (a read from address 309F). This is necessary to ensure that the correct pre-write value is returned for the read request (see the barrier-ordering sketch following this description).
  • FIG. 3(b) shows an example of an alternate form of a request stream, wherein the over-restrictiveness can be avoided.
  • the barrier requests include an address to indicate the requests for which the barrier applies.
  • request 5 (“MEMIBAR 309F")
  • that barrier constrains only the relative reordering of requests which deal with address 309F, namely requests 4 and 6.
  • the memory subsystem can reorder requests 4 and 6 relative to requests 2 and 3 for more efficient paging.
  • request 14 can be reordered relative to requests 10-13, thereby allowing two read requests to be handled with a single read, as might occur when two processors are reading the same memory address.
  • barrier requests include addresses
  • an excess of barrier requests (e.g., a multitude of consecutive barrier requests, one per address being constrained by a barrier)
  • the processor need not introduce barrier requests to enforce ordering of requests when one of the requests has already been acknowledged by the memory system. For example, the writes to location 3108 that appear on lines 3 and 13 must be ordered, but this example assumes that by the time the request on line 13 is issued, the request on line 3 has been acknowledged. In the Sproull-Sutherland dual-path memory, a more complex barrier procedure is needed.
  • the client (a processor, in this example)
  • "HB" is short for "half barrier".
  • the client retains a record of pending reads and pending writes, and checks a new request before sending it to the memory. If there is a possible conflict, the client first sends the HB markers and then issues the new request.
  • the memory system obeys the following rules:
  • Rule M3 requires that the paths be synchronized by HB markers. One way to do this is to require that, when one path (read or write) processes an HB marker, the memory hold that path up until the other path (write or read, respectively) reaches an HB marker. Intermediate elements which handle requests, but which need not serialize memory accesses, need not hold up for HB markers. Thus, Rule M3 should be applied only to elements which must serialize requests, such as the read/write interface at a memory chip (see the half-barrier synchronization sketch following this description).
  • as read requests, write requests and HB markers travel through the memory system, they eventually come to a "memory chip" itself.
  • reads and writes may be traveling in separate paths, much like a two-ported memory (i.e., having separate read and write ports). These ports may be designed with a "recursive interface" as described by Sproull-Sutherland. Inside this memory chip, the read and write paths finally meet, both potentially accessing the same memory location. To avoid consistency problems, Rule M3 is enforced there.
  • HB markers are inserted into the memory system by the client and those markers meet at the memory chip, where they are used to synchronize the read and write channels.
  • the memory system may then apply arbitrary policies to requests, give priority to reads, reorder writes with respect to each other (e.g., to take advantage of fast "page mode" on memory chips), etc.
  • FIG. 4 shows the half barriers with addresses, as is the case in FIG. 3(b), but the dual-path memory system could also be implemented without half barrier addresses, as is the case in FIG. 3(a).
  • the HB markers would be more powerful than necessary to establish the required ordering constraints if they did not include addresses. If an HB marker must be inserted before a read of location X or before a write of location X, it is because there is a pending read or write request for location X that might conflict. In such cases, the memory subsystem need only guarantee that pending requests for location X are not reordered with respect to the marker. Therefore, if the address X is attached to the HB marker, potential conflicts can be avoided without excessive restraint.
  • the subset of address bits to use in the marker might vary depending on the configuration of the memory system or the characteristics of the client.
  • the subset might be the "low order" bits of the address, or the "high order" bits of the address. Those skilled in the art of memory design will recognize that these are only examples and that other subsets of address bits could be used (see the partial-tag sketch following this description).
  • the objective in associating full or partial addresses with a marker is to reduce the frequency with which markers must be introduced into the memory system. The reason is that markers will prevent the memory system from reordering memory requests to achieve maximum performance.
  • With full or partial address tags, order enforcement can also be used with "banked" memory.
  • Large memories are often composed of memory banks, i.e., each bank is responsible for a range of memory locations.
  • the memory system has some form of “distributor” that accepts memory requests and distributes them to the proper bank, i.e., the bank that contains the memory location specified in the read or write request.
  • One special form of bank structure is known as "interleaving" in which low-order address bits select a memory bank. For example, in a two-bank system, even addresses are located in one bank and odd addresses in another.
  • the distributor delivers memory requests according to which bank contains the addressed location. It must also deliver markers. Markers without addresses must be delivered to every bank, because the memory system cannot know which requests are being prevented from reordering. Thus, for example, in a two-bank system, when the distributor receives a marker along the read path, it must send a marker to each of the two banks along their respective read paths. However, if a marker contains an address, it is necessary to forward that marker only to the one bank that contains that address. Note that partial address tags can be used with banked memory to the same effect, so long as the tag identifies the bank (see the distributor sketch following this description).
  • while FIGS. 3-4 show operations on single memory addresses, it should be understood that the amount of memory processed as part of a particular request is not fixed, but can vary.
  • a memory system might be configured to handle several sizes of requests, such as a single word read, a cache line read (e.g., 16 words in a cache line), single word write, cache line write, or writing selected bytes within a word.
  • the "address" of the read or write request is the address of the first word of what may be a multi-word request. This may be important in deciding whether a barrier that contains an address (first word address) can be reordered with respect to another request.
  • FIG. 5 is a block diagram of a banked memory subsystem 500 illustrating these points.
  • Banked memory subsystem 500 is shown with an interface circuit 501 coupling subsystem 500 and a processor and a distributor 502 which routes memory requests to the appropriate memory bank.
  • Two bank memories 503 are shown, but it should be understood that the memory can be divided into more than two banks.
  • read requests and write requests travel along separate paths, namely read path 504 and write path 506.
  • Distributor 502 examines the address of each request and, in this example, routes requests with odd addresses to bank memory 503(1) using a bank read path 508(1) and a bank write path 510(1) and routes requests with even addresses to bank memory 503(2) using a bank read path 508(2) and a bank write path 510(2). Reordering for memory optimization might occur at either interface circuit 501, distributor 502 or at the inputs of bank memories 503. The flow of barrier requests will now be described.
  • interface circuit 501 sends one half barrier along path 504 and one half barrier along path 506. As explained above, this will allow banked memory subsystem 500 to prevent reordering of read and write requests relative to each other even though they travel along separate paths.
  • the half barriers are detected by distributor 502
  • the half barrier from read path 504 is sent along bank read paths 508 and the half barrier from write path 506 is sent along bank write paths 510. Since a half barrier received on one of bank paths 508 or 510 will hold up memory accesses until the matching half barrier arrives, the broadcasting of half barriers to all bank memories 503 might be overly restrictive.
  • the barriers can include addresses as described above, or can include enough of a partial address so that distributor 502 can identify the bank to which the barrier applies. If such addresses or partial addresses are included, distributor 502 can selectively route the half barriers to only the bank memory containing the address for which the barrier applies.
  • distributor 502 does not need to hold up for half barrier markers to synchronize, but will send along half barrier markers and read or write requests as received.
  • a processor has an interface to a distributed memory having both local memory and "remote" memory, where the remote memory is connected to the processor by a high speed network, a bus extension, or the like.
  • Half barrier markers will be sent to the remote memories just as they are sent to banks in the example shown in FIG. 5.
  • While the systems described above implement marker (half barrier) synchronization at the memory chip interface, other variations are possible. All that is required is that the read and write paths synchronize at some point in their processing, beyond which no reordering is permitted (i.e., there is a nonzero "resolved" region). Such a synchronization point is possible at many different points in a memory system.
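
The sketches below are illustrative additions, not text or code from the patent; every class, function and constant name in them is an assumption chosen for readability, and Python is used only as convenient pseudocode for hardware behavior. This first sketch models the client-side bookkeeping referenced above: the client notes when each request is issued and when it is acknowledged, and holds up any new request that conflicts with a still-pending one.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Request:
    op: str        # "read" or "write"
    addr: int      # memory location
    tag: int       # identifies the request so its acknowledgment can be matched

class Client:
    """Tracks pending reads and writes; defers conflicting new requests."""

    def __init__(self):
        self.pending = {}      # tag -> Request still awaiting an acknowledgment
        self.deferred = []     # requests held up until a conflict clears
        self.sent = []         # what actually went out to the memory, in order

    @staticmethod
    def conflicts_with(new, old):
        # Ordering matters only when both requests touch the same location
        # and at least one of them is a write.
        return new.addr == old.addr and "write" in (new.op, old.op)

    def issue(self, req):
        blocking = list(self.pending.values()) + self.deferred
        if any(self.conflicts_with(req, p) for p in blocking):
            self.deferred.append(req)          # hold the request up
        else:
            self.pending[req.tag] = req
            self.sent.append(req)

    def acknowledge(self, tag):
        self.pending.pop(tag, None)            # the request is no longer pending
        still_deferred = []
        for req in self.deferred:              # retry held-up requests in order
            blocking = list(self.pending.values()) + still_deferred
            if any(self.conflicts_with(req, p) for p in blocking):
                still_deferred.append(req)
            else:
                self.pending[req.tag] = req
                self.sent.append(req)
        self.deferred = still_deferred

if __name__ == "__main__":
    c = Client()
    c.issue(Request("write", 0x309F, tag=1))
    c.issue(Request("read", 0x309F, tag=2))    # conflicts with the pending write
    assert [r.tag for r in c.sent] == [1]      # the read is held up at the client
    c.acknowledge(1)                           # write acknowledged
    assert [r.tag for r in c.sent] == [1, 2]   # the read may now be sent
```

As the description notes, holding requests up at the client wastes interface bandwidth; the sketches that follow move the ordering information across the interface instead.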
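
The barrier-ordering sketch below (again with assumed names) expresses the constraint carried by a MEMIBAR request: a plain barrier, as in FIG. 3(a), orders all requests across it, while an address-tagged barrier, as in FIG. 3(b), constrains only requests that deal with the tagged address, leaving the memory subsystem free to reorder everything else.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Req:
    kind: str                    # "read", "write" or "memibar"
    addr: Optional[int] = None   # a barrier's address is optional (FIG. 3(a) has none)

def must_precede(stream, i, j):
    """True when stream[i] must not be executed after stream[j] (i appears
    before j in the stream) because some MEMIBAR between them applies to both."""
    a, b = stream[i], stream[j]
    for k in range(i + 1, j):
        bar = stream[k]
        if bar.kind != "memibar":
            continue
        if bar.addr is None:              # plain barrier: orders every pair across it
            return True
        if bar.addr == a.addr == b.addr:  # addressed barrier: orders only requests
            return True                   # that deal with the tagged address
    return False

if __name__ == "__main__":
    stream = [
        Req("read", 0x309F),      # corresponds to request 4 in FIG. 3(b)
        Req("memibar", 0x309F),   # request 5, "MEMIBAR 309F"
        Req("write", 0x309F),     # request 6
        Req("write", 0x3108),     # an unrelated write
    ]
    assert must_precede(stream, 0, 2)      # read of 309F must stay before write of 309F
    assert not must_precede(stream, 0, 3)  # the write to 3108 may still be reordered
```

A memory subsystem could consult such a predicate when grouping requests for page-mode optimization, reordering freely wherever the predicate permits.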
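
The half-barrier synchronization sketch below is a minimal model, under assumed names, of Rule M3: at the point where the read and write paths finally meet, an HB marker holds its path up until the matching marker arrives on the other path, so a read and a conflicting write separated by a pair of HB markers cannot pass one another, while requests between markers may still be served under any policy.

```python
from collections import deque

HB = "HB"   # half-barrier marker; real markers may also carry an address

def drain(read_path, write_path):
    """Return the order in which requests reach the memory cells."""
    reads, writes = deque(read_path), deque(write_path)
    executed = []
    while reads or writes:
        progressed = False
        # Between barriers the memory may prefer reads (or apply any other policy).
        for q in (reads, writes):
            if q and q[0] != HB:
                executed.append(q.popleft())
                progressed = True
        if not progressed:
            if reads and reads[0] == HB and writes and writes[0] == HB:
                # The two markers synchronize and are consumed together.
                reads.popleft()
                writes.popleft()
            else:
                # A lone HB with nothing left on the other path cannot block forever.
                (reads if reads else writes).popleft()
    return executed

if __name__ == "__main__":
    # A read of 309F followed by its HB marker on the read path; the conflicting
    # write of 309F waits behind its own HB marker on the write path.
    order = drain(["R 309F", HB], [HB, "W 309F"])
    assert order == ["R 309F", "W 309F"]   # the read is guaranteed to execute first
```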
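
The partial-tag sketch below illustrates tagging a marker with only a subset of the address bits; the choice of the 8 low-order bits is an assumption for the example. Matching on a partial tag is conservative: it may constrain a few unrelated requests that happen to share the tag, but it never misses a request for the tagged location.

```python
TAG_BITS = 8                  # assumption: keep only the 8 low-order address bits
TAG_MODULUS = 2 ** TAG_BITS

def partial_tag(addr):
    # Keep only the low-order TAG_BITS bits of the address.
    return addr % TAG_MODULUS

def marker_applies(marker_tag, request_addr):
    # A marker applies to every request whose address shares the partial tag.
    return partial_tag(request_addr) == marker_tag

if __name__ == "__main__":
    tag = partial_tag(0x309F)
    assert marker_applies(tag, 0x309F)      # the intended location is constrained
    assert marker_applies(tag, 0x419F)      # a false positive: harmless, merely stricter
    assert not marker_applies(tag, 0x3100)  # unrelated locations remain free to reorder
```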
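
Finally, the distributor sketch models a two-bank interleaved memory in the spirit of FIG. 5 (with assumed names and 0-based bank indices): a low-order address bit selects the bank, half-barrier markers without an address must be broadcast to every bank, and markers that carry an address, or enough low-order bits to identify the bank, are forwarded only to the bank that contains the addressed location.

```python
from dataclasses import dataclass
from typing import Optional

NUM_BANKS = 2

@dataclass
class Item:
    kind: str                   # "read", "write" or "hb"
    addr: Optional[int] = None  # None for an un-addressed half barrier

def bank_of(addr):
    # Interleaving: the low-order address bit selects the bank, so even
    # addresses go to bank 0 and odd addresses to bank 1.
    return addr % NUM_BANKS

def distribute(path):
    """Split one path (read or write) into per-bank paths."""
    banks = [[] for _ in range(NUM_BANKS)]
    for item in path:
        if item.kind == "hb" and item.addr is None:
            for b in banks:                 # must broadcast: any bank might hold
                b.append(item)              # a request the marker constrains
        else:
            banks[bank_of(item.addr)].append(item)
    return banks

if __name__ == "__main__":
    read_path = [Item("read", 0x3100), Item("hb", 0x309F), Item("read", 0x309F)]
    banks = distribute(read_path)
    # The addressed half barrier reaches only the bank holding 0x309F (odd),
    # so the other bank is never held up waiting for a matching marker.
    assert [i.kind for i in banks[0]] == ["read"]
    assert [i.kind for i in banks[1]] == ["hb", "read"]
```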

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)
  • Multi Processors (AREA)

Abstract

A memory interface is provided between a processor and a memory subsystem capable of multiple concurrent transactions or accesses. The interface between the processor and the memory carries read and write operations as well as "barrier" operations, a barrier operation signaling that operations may not be reordered. In another aspect of the invention, the memory interface interfaces with one or more memory-mapped input/output (I/O) devices or compute devices.
PCT/US1999/001387 1998-01-23 1999-01-21 Procede et appareil d'application d'execution ordonnee de lecture et d'ecriture a travers une interface de memoire WO1999038085A1 (fr)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2000528921A JP2002510079A (ja) 1998-01-23 1999-01-21 メモリ・インタフェース間で読み書きの順序付けられた実行を強制する方法と装置
AU23361/99A AU2336199A (en) 1998-01-23 1999-01-21 Method and apparatus for enforcing ordered execution of reads and writes across a memory interface
EP99903307A EP1047996A1 (fr) 1998-01-23 1999-01-21 Procede et appareil d'application d'execution ordonnee de lecture et d'ecriture a travers une interface de memoire

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US09/012,882 US6038646A (en) 1998-01-23 1998-01-23 Method and apparatus for enforcing ordered execution of reads and writes across a memory interface
US09/012,882 1998-01-23

Publications (1)

Publication Number Publication Date
WO1999038085A1 true WO1999038085A1 (fr) 1999-07-29

Family

ID=21757199

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1999/001387 WO1999038085A1 (fr) 1998-01-23 1999-01-21 Procede et appareil d'application d'execution ordonnee de lecture et d'ecriture a travers une interface de memoire

Country Status (5)

Country Link
US (1) US6038646A (fr)
EP (1) EP1047996A1 (fr)
JP (1) JP2002510079A (fr)
AU (1) AU2336199A (fr)
WO (1) WO1999038085A1 (fr)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001033363A3 (fr) * 1999-11-02 2001-12-13 Siemens Ag Systeme de bus permettant le traitement simultane de differents acces a la memoire avec des solutions de systeme sur puce
WO2008151101A1 (fr) 2007-06-01 2008-12-11 Qualcomm Incorporated Limites de mémoire dirigées par périphérique
WO2011045555A1 (fr) * 2009-10-13 2011-04-21 Arm Limited Requêtes de transaction barrière à latence réduite dans des interconnexions

Families Citing this family (47)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140325175A1 (en) * 2013-04-29 2014-10-30 Pact Xpp Technologies Ag Pipeline configuration protocol and configuration unit communication
US6757791B1 (en) * 1999-03-30 2004-06-29 Cisco Technology, Inc. Method and apparatus for reordering packet data units in storage queues for reading and writing memory
US6256713B1 (en) * 1999-04-29 2001-07-03 International Business Machines Corporation Bus optimization with read/write coherence including ordering responsive to collisions
EP1228440B1 (fr) 1999-06-10 2017-04-05 PACT XPP Technologies AG Partionnement de séquences dans des structures cellulaires
US6678810B1 (en) * 1999-12-30 2004-01-13 Intel Corporation MFENCE and LFENCE micro-architectural implementation method and system
US6988154B2 (en) * 2000-03-10 2006-01-17 Arc International Memory interface and method of interfacing between functional entities
US6963967B1 (en) * 2000-06-06 2005-11-08 International Business Machines Corporation System and method for enabling weak consistent storage advantage to a firmly consistent storage architecture
US6826619B1 (en) 2000-08-21 2004-11-30 Intel Corporation Method and apparatus for preventing starvation in a multi-node architecture
US6487643B1 (en) 2000-09-29 2002-11-26 Intel Corporation Method and apparatus for preventing starvation in a multi-node architecture
US8058899B2 (en) 2000-10-06 2011-11-15 Martin Vorbach Logic cell array and bus system
US6772298B2 (en) 2000-12-20 2004-08-03 Intel Corporation Method and apparatus for invalidating a cache line without data return in a multi-node architecture
US6791412B2 (en) * 2000-12-28 2004-09-14 Intel Corporation Differential amplifier output stage
US7234029B2 (en) * 2000-12-28 2007-06-19 Intel Corporation Method and apparatus for reducing memory latency in a cache coherent multi-node architecture
US20020087775A1 (en) * 2000-12-29 2002-07-04 Looi Lily P. Apparatus and method for interrupt delivery
US20020087766A1 (en) * 2000-12-29 2002-07-04 Akhilesh Kumar Method and apparatus to implement a locked-bus transaction
US6721918B2 (en) 2000-12-29 2004-04-13 Intel Corporation Method and apparatus for encoding a bus to minimize simultaneous switching outputs effect
US9552047B2 (en) 2001-03-05 2017-01-24 Pact Xpp Technologies Ag Multiprocessor having runtime adjustable clock and clock dependent power supply
US9411532B2 (en) 2001-09-07 2016-08-09 Pact Xpp Technologies Ag Methods and systems for transferring data between a processing device and external devices
US9436631B2 (en) 2001-03-05 2016-09-06 Pact Xpp Technologies Ag Chip including memory element storing higher level memory data on a page by page basis
US9250908B2 (en) 2001-03-05 2016-02-02 Pact Xpp Technologies Ag Multi-processor bus and cache interconnection system
DE60100363T2 (de) 2001-05-11 2004-05-06 Sospita A/S Sequenznummerierungsmechanismus zur sicherung der ausführungsordnungs-integrietät von untereinander abhängigen smart-card anwendungen
US10031733B2 (en) 2001-06-20 2018-07-24 Scientia Sol Mentis Ag Method for processing data
US6971098B2 (en) 2001-06-27 2005-11-29 Intel Corporation Method and apparatus for managing transaction requests in a multi-node architecture
US9170812B2 (en) 2002-03-21 2015-10-27 Pact Xpp Technologies Ag Data processing system having integrated pipelined array data processor
EP1537486A1 (fr) 2002-09-06 2005-06-08 PACT XPP Technologies AG Structure de sequenceur reconfigurable
US7814488B1 (en) * 2002-09-24 2010-10-12 Oracle America, Inc. Quickly reacquirable locks
US7360069B2 (en) * 2004-01-13 2008-04-15 Hewlett-Packard Development Company, L.P. Systems and methods for executing across at least one memory barrier employing speculative fills
US7243200B2 (en) * 2004-07-15 2007-07-10 International Business Machines Corporation Establishing command order in an out of order DMA command queue
JP4327081B2 (ja) * 2004-12-28 2009-09-09 京セラミタ株式会社 メモリアクセス制御回路
US7613886B2 (en) * 2005-02-08 2009-11-03 Sony Computer Entertainment Inc. Methods and apparatus for synchronizing data access to a local memory in a multi-processor system
US7617343B2 (en) * 2005-03-02 2009-11-10 Qualcomm Incorporated Scalable bus structure
US9026744B2 (en) * 2005-03-23 2015-05-05 Qualcomm Incorporated Enforcing strongly-ordered requests in a weakly-ordered processing
US7500045B2 (en) * 2005-03-23 2009-03-03 Qualcomm Incorporated Minimizing memory barriers when enforcing strongly-ordered requests in a weakly-ordered processing system
US7574565B2 (en) * 2006-01-13 2009-08-11 Hitachi Global Storage Technologies Netherlands B.V. Transforming flush queue command to memory barrier command in disk drive
US7917676B2 (en) * 2006-03-10 2011-03-29 Qualcomm, Incorporated Efficient execution of memory barrier bus commands with order constrained memory accesses
US7818306B2 (en) * 2006-03-24 2010-10-19 International Business Machines Corporation Read-copy-update (RCU) operations with reduced memory barrier usage
US7783817B2 (en) * 2006-08-31 2010-08-24 Qualcomm Incorporated Method and apparatus for conditional broadcast of barrier operations
US8108584B2 (en) * 2008-10-15 2012-01-31 Intel Corporation Use of completer knowledge of memory region ordering requirements to modify transaction attributes
US8055816B2 (en) * 2009-04-09 2011-11-08 Micron Technology, Inc. Memory controllers, memory systems, solid state drives and methods for processing a number of commands
US8417912B2 (en) 2010-09-03 2013-04-09 International Business Machines Corporation Management of low-paging space conditions in an operating system
US8782356B2 (en) 2011-12-09 2014-07-15 Qualcomm Incorporated Auto-ordering of strongly ordered, device, and exclusive transactions across multiple memory regions
US9021228B2 (en) 2013-02-01 2015-04-28 International Business Machines Corporation Managing out-of-order memory command execution from multiple queues while maintaining data coherency
US9594713B2 (en) * 2014-09-12 2017-03-14 Qualcomm Incorporated Bridging strongly ordered write transactions to devices in weakly ordered domains, and related apparatuses, methods, and computer-readable media
US9946492B2 (en) * 2015-10-30 2018-04-17 Arm Limited Controlling persistent writes to non-volatile memory based on persist buffer data and a persist barrier within a sequence of program instructions
US11409530B2 (en) * 2018-08-16 2022-08-09 Arm Limited System, method and apparatus for executing instructions
TWI767175B (zh) * 2019-01-31 2022-06-11 美商萬國商業機器公司 用於處理輸入輸出儲存指令之資料處理系統、方法及電腦程式產品
TWI773959B (zh) 2019-01-31 2022-08-11 美商萬國商業機器公司 用於處理輸入輸出儲存指令之資料處理系統、方法及電腦程式產品

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0679993A2 (fr) * 1994-04-28 1995-11-02 Hewlett-Packard Company Ordinateur possédant des instructions spéciales pour imposer des opérations de chargement et de stockage ordonnées
WO1996030838A1 (fr) * 1995-03-31 1996-10-03 Samsung & Electronic, Co. Ltd. Controleur de memoire qui execute des commandes de lecture et d'ecriture dans le desordre
US5666506A (en) * 1994-10-24 1997-09-09 International Business Machines Corporation Apparatus to dynamically control the out-of-order execution of load/store instructions in a processor capable of dispatchng, issuing and executing multiple instructions in a single processor cycle

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5222237A (en) * 1988-02-02 1993-06-22 Thinking Machines Corporation Apparatus for aligning the operation of a plurality of processors
US6088768A (en) * 1993-12-28 2000-07-11 International Business Machines Corporation Method and system for maintaining cache coherence in a multiprocessor-multicache environment having unordered communication
US5666494A (en) * 1995-03-31 1997-09-09 Samsung Electronics Co., Ltd. Queue management mechanism which allows entries to be processed in any order

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0679993A2 (fr) * 1994-04-28 1995-11-02 Hewlett-Packard Company Ordinateur possédant des instructions spéciales pour imposer des opérations de chargement et de stockage ordonnées
US5666506A (en) * 1994-10-24 1997-09-09 International Business Machines Corporation Apparatus to dynamically control the out-of-order execution of load/store instructions in a processor capable of dispatchng, issuing and executing multiple instructions in a single processor cycle
WO1996030838A1 (fr) * 1995-03-31 1996-10-03 Samsung & Electronic, Co. Ltd. Controleur de memoire qui execute des commandes de lecture et d'ecriture dans le desordre

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001033363A3 (fr) * 1999-11-02 2001-12-13 Siemens Ag Systeme de bus permettant le traitement simultane de differents acces a la memoire avec des solutions de systeme sur puce
WO2008151101A1 (fr) 2007-06-01 2008-12-11 Qualcomm Incorporated Limites de mémoire dirigées par périphérique
US7984202B2 (en) * 2007-06-01 2011-07-19 Qualcomm Incorporated Device directed memory barriers
KR101149622B1 (ko) * 2007-06-01 2012-05-29 콸콤 인코포레이티드 장치 지향 메모리 베리어들
EP2600254A1 (fr) * 2007-06-01 2013-06-05 Qualcomm Incorporated Limites de mémoire dirigées par péripherique
JP2013242876A (ja) * 2007-06-01 2013-12-05 Qualcomm Inc デバイスへ向けられたメモリ・バリア
WO2011045555A1 (fr) * 2009-10-13 2011-04-21 Arm Limited Requêtes de transaction barrière à latence réduite dans des interconnexions
US8607006B2 (en) 2009-10-13 2013-12-10 Arm Limited Barrier transactions in interconnects
US8856408B2 (en) 2009-10-13 2014-10-07 Arm Limited Reduced latency barrier transaction requests in interconnects
US9477623B2 (en) 2009-10-13 2016-10-25 Arm Limited Barrier transactions in interconnects

Also Published As

Publication number Publication date
AU2336199A (en) 1999-08-09
US6038646A (en) 2000-03-14
EP1047996A1 (fr) 2000-11-02
JP2002510079A (ja) 2002-04-02

Similar Documents

Publication Publication Date Title
US6038646A (en) Method and apparatus for enforcing ordered execution of reads and writes across a memory interface
US6816947B1 (en) System and method for memory arbitration
US5398325A (en) Methods and apparatus for improving cache consistency using a single copy of a cache tag memory in multiple processor computer systems
US6920516B2 (en) Anti-starvation interrupt protocol
US6643747B2 (en) Processing requests to efficiently access a limited bandwidth storage area
CN102834813B (zh) 用于多通道高速缓存的更新处理机
KR20000022712A (ko) 노드 상호 접속망 상에서 요구를 예측 방식으로 발행하는 비균일 메모리 액세스 데이터 처리 시스템
US6014721A (en) Method and system for transferring data between buses having differing ordering policies
JP2001117859A (ja) バス制御装置
US5659707A (en) Transfer labeling mechanism for multiple outstanding read requests on a split transaction bus
US6546465B1 (en) Chaining directory reads and writes to reduce DRAM bandwidth in a directory based CC-NUMA protocol
US6591345B1 (en) Method for ensuring maximum bandwidth on accesses to strided vectors in a bank-interleaved cache
US6347349B1 (en) System for determining whether a subsequent transaction may be allowed or must be allowed or must not be allowed to bypass a preceding transaction
JPH0628247A (ja) 動的に再配置されるメモリバンク待ち行列
US6836823B2 (en) Bandwidth enhancement for uncached devices
JPH0282330A (ja) ムーブアウト・システム
US7406554B1 (en) Queue circuit and method for memory arbitration employing same
JPH10260895A (ja) 半導体記憶装置およびそれを用いた計算機システム
WO2024073182A1 (fr) Prédicats pour un traitement en mémoire
US7073004B2 (en) Method and data processing system for microprocessor communication in a cluster-based multi-processor network
US8279886B2 (en) Dataport and methods thereof
JP5058116B2 (ja) ストリーミングidメソッドによるdmac発行メカニズム
US6349370B1 (en) Multiple bus shared memory parallel processor and processing method
US20070005865A1 (en) Enforcing global ordering using an inter-queue ordering mechanism
US20140136796A1 (en) Arithmetic processing device and method for controlling the same

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
NENP Non-entry into the national phase

Ref country code: KR

ENP Entry into the national phase

Ref country code: JP

Ref document number: 2000 528921

Kind code of ref document: A

Format of ref document f/p: F

WWE Wipo information: entry into national phase

Ref document number: 1999903307

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 1999903307

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWR Wipo information: refused in national office

Ref document number: 1999903307

Country of ref document: EP

WWW Wipo information: withdrawn in national office

Ref document number: 1999903307

Country of ref document: EP
