US20060253677A1 - Data access prediction - Google Patents
Data access prediction
- Publication number
- US20060253677A1 (Application US11/121,309)
- Authority
- US
- United States
- Prior art keywords
- memory access
- instruction
- control value
- prediction signal
- access control
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/14—Handling requests for interconnection or transfer
- G06F13/16—Handling requests for interconnection or transfer for access to memory bus
- G06F13/1605—Handling requests for interconnection or transfer for access to memory bus based on arbitration
- G06F13/161—Handling requests for interconnection or transfer for access to memory bus based on arbitration with latency improvement
- G06F13/1626—Handling requests for interconnection or transfer for access to memory bus based on arbitration with latency improvement by reordering requests
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/14—Handling requests for interconnection or transfer
- G06F13/16—Handling requests for interconnection or transfer for access to memory bus
- G06F13/1605—Handling requests for interconnection or transfer for access to memory bus based on arbitration
- G06F13/1642—Handling requests for interconnection or transfer for access to memory bus based on arbitration with request queuing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3824—Operand accessing
- G06F9/383—Operand prefetching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3824—Operand accessing
- G06F9/383—Operand prefetching
- G06F9/3832—Value prediction for operands; operand history buffers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3854—Instruction completion, e.g. retiring, committing or graduating
- G06F9/3858—Result writeback, i.e. updating the architectural state or memory
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11C—STATIC STORES
- G11C7/00—Arrangements for writing information into, or reading information out from, a digital store
- G11C7/10—Input/output [I/O] data interface arrangements, e.g. I/O data control circuits, I/O data buffers
- G11C7/1015—Read-write modes for single port memories, i.e. having either a random port or a serial port
- G11C7/1039—Read-write modes for single port memories, i.e. having either a random port or a serial port using pipelining techniques, i.e. using latches between functional memory parts, e.g. row/column decoders, I/O buffers, sense amplifiers
Definitions
- the present invention relates to data access.
- Embodiments of the present invention relate to data access in a data processing apparatus in which signals used to cause a data access to occur may be metastable.
- a series of serially-connected processing stages is formed. Between each stage of the pipeline a signal-capture element such as a latch or a sense amplifier may be provided into which one or more signal values are stored.
- each processing stage is responsive to input signals received from preceding processing stages or from elsewhere and generates output signals to be stored in an associated output latch.
- the time taken for the processing logic to complete any processing operations determines the speed at which the data processing apparatus may operate. If the processing logic of the processing stages is able to complete its processing operations in a short period of time, then the signals may rapidly advance through the output latches, resulting in high speed processing. However, the system cannot advance signals between stages more rapidly than the speed at which the slowest processing logic in a stage is able to perform its processing operations on received input signals and generate the appropriate output signals. This limits the performance of the system.
- Some known techniques seek to overcome some of these processing speed limitations. For example, it is possible to increase the rate at which the processing stages are driven until the slowest processing stage is unable to keep pace. Also, to reduce the power consumption of the data processing apparatus, the operating voltage may be reduced up to the point at which the slowest processing stage is no longer able to keep pace. It will be appreciated that in both of these situations processing errors may occur.
- the change of state of the signal during these errors is transient (i.e. it is pulse like) and a reset or a rewrite of the latch or device causes normal behaviour to resume thereafter.
- the signal in this transient state is said to be metastable because it fails to achieve a valid logic level for a period of time, but instead hovers at a metastable voltage somewhere between the logic levels, before transitioning to a valid logic level.
- the structure of a memory is such that both read accesses and write accesses occur using a common address interface. Data should only be written to the cache (known as committing) when the write access has been confirmed to not contain any errors.
- arbitration techniques are provided in order to deal with the occurrence of concurrent read and write access over the common buses, with read accesses being given priority over write accesses. Accordingly, read accesses are performed in preference, with write accesses being placed in the write buffer and postponed until after the write access is confirmed to be error free and no read accesses are outstanding.
- a method of accessing data in a pipelined data processing apparatus in which the operating conditions of the pipelined data processing apparatus are such that metastable signals may occur on at least the boundaries of the pipelined stages, the method comprising the steps of: receiving an indication that an instruction is to be processed by the pipelined data processing apparatus; generating a memory access prediction signal, the memory access prediction signal having a value indicative of whether or not the instruction is likely to cause a read access from a memory; generating a predicted memory access control value from the memory access prediction signal, the predicted memory access control value being generated to achieve and maintain a valid logic level for at least a sampling period thereby preventing any metastability in the predicted memory access control value; and in the event that the predicted memory access control value indicates that a read access is likely to occur, causing a read access to be initiated from the memory.
- the present invention recognises that a problem exists whereby the signals used in a read access may be metastable and that this may cause metastable signals to be used directly in the arbitration of data accesses. This in turn can result in many different types of errors occurring when accessing data. In an extreme case, these errors may cause the data to become corrupted. It will be appreciated that corrupting data is undesirable at the best of times; however, data corruption due to metastability is particularly disadvantageous since it will be almost impossible to determine that the corruption has occurred, because it is extremely unlikely that the status of the signals causing the corruption can be determined.
- the present invention recognises that the metastable signals may be propagated from stage to stage. For example, in arrangements where single cycle fetch and single cycle decode stages are provided, but more than one cycle is required to determine whether signals are metastable, the propagation of metastable signals into, for example, an execute stage cannot easily be prevented without postponing the execution of the data access itself.
- write accesses are postponed by buffering until it is ensured that the write access is valid. Buffering the write access does not adversely affect throughput since the write access will rarely be on the critical path.
- the present invention also recognises that whilst it may be possible to postpone read accesses, those read accesses are typically on the critical path and any delay in performing the read access will cause instructions in the pipeline to be stalled, thereby significantly reducing the throughput of the data processing apparatus.
- an indication that an instruction is to be processed by the pipelined data processing apparatus is received and a memory access prediction signal is then generated.
- the memory access prediction signal has a value indicative of whether or not the instruction is likely to cause a read access from a memory. Hence, an indication is provided when the instruction is likely to cause a read access.
- a predicted memory access control signal is generated from the memory access prediction signal.
- the predicted memory access control signal is generated in a way which prevents any metastability being present in that signal. This is achieved by the predicted memory access control signal achieving and maintaining a valid logic level for at least a sampling period. A read access can then be initiated in the event that it is predicted that a read access is likely to occur.
- a signal used to initiate a read access can be generated in a way which ensures that it will have no metastability. This is possible because that signal is merely a prediction signal rather than the decoded instruction itself and, hence, can be generated much earlier in the pipeline. Because the prediction signal is generated much earlier in the pipeline, it can be ensured that the signal used to cause the memory access has no metastability.
- the signals used in a read access are prevented from being metastable which removes the possibility that metastable signals are used directly in the arbitration of data accesses. Also, the metastable signals may be prevented from being propagated from stage to stage.
- the step of generating the memory access prediction signal comprises the steps of: determining a program counter value associated with the instruction to be processed; and referencing a lookup table to provide the value indicative of whether or not the instruction associated with that program counter value is likely to cause a read access from the memory; and propagating the value provided by the lookup table as the memory access prediction signal.
- the step of determining the program counter value occurs when processing the instruction during a fetch stage of the pipelined processor.
- the method further comprises the step of: storing in the lookup table the value indicative of whether or not the instructions associated with program counter values are likely to cause read accesses from the memory.
- the step of generating the predicted memory access control value comprises the steps of: passing the memory access prediction signal through a synchronising structure to generate the predicted memory access control value having a valid logic level.
- the step of generating the predicted memory access control value comprises the steps of: passing the memory access prediction signal through a pair of latches, each latch being clocked to coincide with the passing of the instruction between subsequent boundaries of the pipelined stages.
- the step of generating the predicted memory access control value comprises the steps of: passing the memory access prediction signal to an input of a first latch; providing an intermediate signal on the output of the first latch as the instruction passes between first and second pipelined stages; passing the intermediate signal to an input of a second latch; and providing the predicted memory access control value on the output of the second latch as the instruction passes between second and third pipelined stages.
- the first, second and third pipelined stages comprise fetch, decode and execute pipelined stages.
- the step of passing the memory access prediction signal through a pair of latches causes the predicted memory access control value to have timing characteristics which achieve a valid logic level prior to a setup period prior to a sampling clock transitioning, said valid logic level being held during a hold period following said sampling clock transitioning.
- the step of causing the read access to be initiated from the memory occurs when the associated instruction is being executed in the execute pipelined stage.
- the read access is initiated at the appropriate stage in the pipeline, but using the memory access prediction signal which is assured to not be metastable.
- the step of generating the memory access prediction signal further includes the step of: generating a timing value indicative of when the associated instruction is likely to be executed in the execute pipelined stage and the step of generating the predicted memory access control value from the memory access prediction signal is responsive to the timing value such that the predicted memory access control value is provided for at least a period in which the associated instruction is likely to be executed in the execute pipelined stage.
- the method further comprises the steps of: processing the instruction in the pipelined stages, the instruction causing an actual memory access signal to be generated; in the event the actual memory access signal has a value indicating a read access from the memory is to occur and the predicted memory access control value indicates that a read access is not likely to occur, causing the execution of the instruction to be stalled whilst an actual memory access control value is generated from the actual memory access signal, the actual memory access control value being generated to have a valid logic level thereby removing any metastability in the actual memory access signal value, and in the event that the actual memory access control value indicates that a read access is to occur, causing a read access to be initiated from the memory.
- the instruction is stalled until the actual memory access control value has been cleaned to remove any metastability, in the same way as the memory access prediction signal was; in the event that the actual memory access control value indicates that a read access is to occur, a read access is initiated from the memory.
- resultant actual memory access control value may be used to update the lookup table.
- an integrated circuit operable to access data in a pipelined data processing apparatus in which the operating conditions of the pipelined data processing apparatus are such that metastable values may occur on at least the boundaries of the pipelined stages
- the integrated circuit comprising: a read access prediction circuit operable to receive an indication that an instruction is to be processed by the pipelined data processing apparatus, the read access prediction circuit being further operable to generate a memory access prediction signal, the memory access prediction signal having a value indicative of whether or not the instruction is likely to cause a read access from a memory; a prediction signal stabilising circuit operable to generate a predicted memory access control value from the memory access prediction signal, the predicted memory access control value being generated to achieve and maintain a valid logic level for at least a sampling period thereby removing any metastability in the memory access prediction signal value; and a memory access circuit operable, in the event that the predicted memory access control value indicates that a read access is likely to occur, to cause a read access to be initiated from the memory.
- an integrated circuit for accessing data in a pipelined data processing apparatus in which the operating conditions of the pipelined data processing apparatus are such that metastable values may occur on at least the boundaries of the pipelined stages
- the integrated circuit comprising: read access prediction means for receiving an indication that an instruction is to be processed by the pipelined data processing apparatus and for generating a memory access prediction signal, the memory access prediction signal having a value indicative of whether or not the instruction is likely to cause a read access from a memory; prediction signal stabilising means for generating a predicted memory access control value from the memory access prediction signal, the predicted memory access control value being generated to achieve and maintain a valid logic level for at least a sampling period thereby removing any metastability in the memory access prediction signal value; and memory access means for, in the event that the predicted memory access control value indicates that a read access is likely to occur, causing a read access to be initiated from the memory.
- FIG. 1 illustrates a data processing apparatus according to an embodiment of the present invention
- FIG. 2 is a timing diagram illustrating the operation of the read access prediction logic and the misprediction logic of FIG. 1;
- FIG. 3 is a flow chart illustrating in more detail the read access prediction technique performed by the data processing apparatus of FIG. 1.
- FIG. 1 illustrates a data processing apparatus, generally 10 , according to an embodiment of the present invention.
- the data processing apparatus 10 comprises a processor core 20 coupled with a data RAM 30 and an error detection/correction unit 40 .
- the processor core 20 is operable to process instructions and data retrieved from a main memory (not shown).
- the data RAM 30 is arranged to store data so that it is subsequently readily accessible by the processor core 20 .
- the data RAM 30 will store the data values associated with a memory address until they are overwritten by data values for a new memory address required by the processor core 20.
- the data values are stored in the data RAM 30 using either physical or virtual memory addresses. Well known cache allocation policies may be used when reading or writing data values to the data RAM 30 .
- the error detection/correction unit 40 is operable to determine whether any errors occur during the processing of instructions. For example, the error detection/correction unit 40 will, at a system level, determine whether any timing violations have occurred in any of the signals used in the processing of data and, whether any metastability may have resulted.
- the error detection/correction unit 40 will initiate the appropriate corrective measures in order to prevent incorrect operation of the data processing apparatus 10 .
- the operation of the data processing apparatus 10 may be reset or restarted from a safe position.
- the processor core 20 comprises a pipeline 90 coupled with write logic 50 , read access prediction logic 70 , misprediction logic 80 and cache interface logic 60 .
- the write logic 50 comprises a store buffer 100 operable to store data values which have been indicated as being required to be allocated to the data RAM 30 and commit logic 110 which determines when data values stored in the store buffer 100 are available for storing in the data RAM 30 .
- the store buffer 100 comprises a first-in first-out buffer which receives data values from a write-back stage 240 of the pipeline 90 .
- Data values to be placed in the store buffer 100 are qualified by stabilisation stages (not shown) which are provided between the write-back stage 240 and the store buffer 100 .
- the stabilisation stages store the data values therein for a predetermined number of clock cycles. Once the predetermined number of clock cycles (in this example two clock cycles) has passed then the data value is stored in the store buffer 100 and will be available to the commit logic 110 for allocation to the data RAM 30 . In this way, it can be ensured that any of the data values or signals used to write to the data RAM 30 have no metastability and, hence, no errors will occur in the data being written to the data RAM 30 .
- When the commit logic 110 receives data values from the store buffer 100 to be stored in the data RAM 30, the commit logic 110 provides a number of signals to the cache interface logic 60. These signals indicate whether data values are now available to be written to the data RAM 30 (W_VALID), the address associated with that data (W_ADD) and the data values themselves (W_DATA).
- the W_VALID signal and the output from the OR gate 180 are provided to an OR gate 112 . Should the W_VALID signal or the output from the OR gate 180 be set (indicating that either a write or a read access is to occur) then the Chip Select input of the data RAM 30 will be set.
- the W_VALID signal is provided to an AND gate 114, and the output from the OR gate 180 is provided to an inverting input of the AND gate 114.
- the write/read input of the data RAM 30 will be set to indicate that a write access should occur; otherwise the write/read input of the data RAM 30 will be cleared to indicate that a read access should occur.
- the output of the OR gate 180 is provided to a multiplexer 116 to select either a write address provided by the commit logic 110 or a read address provided by the execute stage 220 depending on whether a write or a read access is to occur.
- the read access prediction logic 70 receives from the fetch stage 200 the value of the program counter associated with the instruction being fetched by the fetch stage 200 .
- the value of the program counter is provided to a read prediction circuit 120 .
- the read prediction circuit 120 stores historic information indicating whether an instruction associated with that program counter value resulted in a read access to the data RAM 30 .
- a prediction signal PREDICT_FE is asserted over the path 125 to an input of a first latch 130 . Otherwise, no signal is asserted to the first latch 130 .
- the PREDICT_FE signal is clocked through the first latch 130 and provided as a predict signal PREDICT_DE to the input of a second latch 140 .
- On the rising edge of the next clock cycle, the second latch 140 outputs a signal PREDICT_EX to the misprediction logic 80.
- the instruction which was used by the load prediction circuit 120 to generate the prediction signal also passes through the pipeline 90.
- the execute stage 220 will have generated an ACTUAL_EX signal which indicates whether the instruction appears to have resulted in a read access being required (it will be appreciated that the ACTUAL_EX signal may be metastable and so it is not certain that a read access will be required).
- the PREDICT_EX and the ACTUAL_EX signal may be compared.
- the PREDICT_EX signal can be used to directly drive the cache interface logic 60 to cause a read of the data RAM 30 .
- the signal used to cause a read from the data RAM 30 can be assured not to be metastable. This prevents many different types of errors from occurring when accessing data in the data RAM 30 and also helps to ensure that the data values in the data RAM 30 cannot become corrupted as a result of the read access.
- the misprediction logic 80 is used to resolve this conflict.
- the misprediction logic 80 comprises an AND gate 150 , a first latch 160 , a second latch 170 , an OR gate 180 and stall logic 190 .
- the PREDICT_EX signal is received at an inverting input of the AND gate 150 with the other non-inverting input receiving the ACTUAL_EX signal.
- the output of the OR gate 180 will be low, which will not cause a read access to be initiated in the data RAM 30 .
- the output of the AND gate 150 will be asserted, which will cause the stall logic 190 to stall the memory stage and all earlier pipelined stages.
- the signal provided to the first latch 160 will be output to the second latch 170 .
- the output of the second latch 170 will be provided to the stall logic and the OR gate 180 .
- any metastability in that signal is resolved, enabling the output of the latch 170 to be used to initiate a data cache access in the event of a misprediction.
- the memory execute stage 230 will be stalled for two cycles. In this way, in the event that the ACTUAL_EX signal is resolved at the output of the latch 170 to cause a read access from the data RAM 30 then the OR gate 180 will assert an output which causes the cache interface logic 60 to access the data from the data RAM 30 . Because this signal has also been delayed for two cycles it can be ensured that the signal driving the cache access is also not metastable.
- a resolved version of the ACTUAL_EX signal can be used in order to update the load prediction circuit 120 with details of whether a read access did or did not actually need to occur for the instruction having that program counter value.
- the ACTUAL_EX signal provided by the execute stage 220 may be provided via an alternative stabilisation structure in order to update the load prediction circuit 120 .
- the load prediction circuit 120 may be updated when either a read access occurred, but a read access was not predicted or a read access did not occur, but a read access was predicted.
- FIG. 2 illustrates the operation of the read access prediction logic 70 and the misprediction logic 80 in more detail.
- the program counter is used to generate the instruction fetch address.
- the value of the program counter is latched into the fetch stage 200 and provided to the load prediction circuit 120.
- the PREDICT_FE signal is determined, based on the value of the program counter.
- the output of the load prediction circuit 120 is sampled by the first latch 130 and provided as the PREDICT_DE signal.
- the PREDICT_DE signal is sampled by the second latch 140 .
- the output of the second latch 140 is provided as the PREDICT_EX signal to the misprediction logic 80 .
- the instruction has reached the execute stage 220 and the ACTUAL_EX signal is also presented to the misprediction logic 80 .
- if the PREDICT_EX signal indicates that a read access should occur, the read access will occur and, if the ACTUAL_EX signal resolves to indicate that a read access should not occur, then the read data will be discarded.
- FIG. 3 illustrates the read access prediction technique in more detail.
- the cache access is determined using the PREDICT_EX signal.
- if at step s20 it is determined that the PREDICT_EX signal and the resolved ACTUAL_EX signal are identical then processing proceeds back to step s10.
- at step s30 it is determined whether a read was predicted by the PREDICT_EX signal but the resolved ACTUAL_EX signal did not indicate that a read was required.
- if so, the read data value is discarded and at step s50 the history information associated with the read access prediction circuit 120 is updated to indicate that the instruction associated with that program counter value is not predicted to result in a read access.
- if at step s30 it is determined that the PREDICT_EX signal indicates that a read access was not predicted but the ACTUAL_EX signal indicates that a read should occur then, at step s60, the pipeline will be stalled.
- at step s70 the ACTUAL_EX signal will be resolved and, in the event that the ACTUAL_EX signal continues to indicate that a read access should occur, the history information associated with the load prediction circuit will be updated to indicate that a read access should occur for the instruction associated with that program counter value.
- the requested data value will be read from the data RAM 30 .
- at step s90 the stall on the pipeline 90 will be removed.
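- The flow of FIG. 3 can be summarised in software form. The Python sketch below is purely illustrative and is not part of the patent: the step numbers, the PREDICT_EX/ACTUAL_EX names and the history table follow the description above, while helper names such as read_data_ram and resolve are hypothetical stand-ins for hardware behaviour.

```python
DATA_RAM = {0x100: 0xDEADBEEF}  # hypothetical memory contents

def read_data_ram(address):
    return DATA_RAM.get(address, 0)

def resolve(raw_actual_ex):
    # Stands in for the two-cycle stabilisation of the ACTUAL_EX signal.
    return bool(raw_actual_ex)

def handle_cache_access(pc, history, predict_ex, raw_actual_ex):
    """Illustrative model of the FIG. 3 flow (steps s10 to s90)."""
    # s10: the cache access is initiated (or not) using PREDICT_EX alone.
    data = read_data_ram(pc) if predict_ex else None
    # s20: compare the prediction with the resolved actual signal.
    actual_ex = resolve(raw_actual_ex)
    if predict_ex == actual_ex:
        return data                      # prediction was correct
    if predict_ex and not actual_ex:
        # s30/s50: a read was predicted but not required - discard the
        # read data and retrain the history for this program counter.
        history[pc] = False
        return None
    # s60-s90: a read was required but not predicted - stall the pipeline,
    # resolve ACTUAL_EX, update the history, perform the read, then unstall.
    history[pc] = True                   # s70: update the history
    data = read_data_ram(pc)             # read once the stall has resolved
    return data                          # s90: stall removed, data returned

history = {}
print(hex(handle_cache_access(0x100, history, predict_ex=False, raw_actual_ex=1)))
```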
- the memory access prediction signal has a value indicative of whether or not the instruction is likely to cause a read access from a memory.
- the predicted memory access control signal is generated in a way which prevents any metastability being present in that signal. This is achieved by the predicted memory access control signal achieving and maintaining a valid logic level for at least a sampling period.
- a read access can then be initiated in the event that it is predicted that a read access is likely to occur.
- the signals used in a read access are prevented from being metastable which removes the possibility that metastable signals are used directly in the arbitration of data accesses. Also, the metastable signals may be prevented from being propagated from stage to stage.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Memory System Of A Hierarchy Structure (AREA)
Abstract
A method and integrated circuit for accessing data in a pipelined data processing apparatus in which the operating conditions of the pipelined data processing apparatus are such that metastable signals may occur on at least the boundaries of the pipelined stages is disclosed. The method comprises the steps of: receiving an indication that an instruction is to be processed by the pipelined data processing apparatus; generating a memory access prediction signal, the memory access prediction signal having a value indicative of whether or not the instruction is likely to cause a read access from a memory; generating a predicted memory access control value from the memory access prediction signal, the predicted memory access control value being generated to achieve and maintain a valid logic level for at least a sampling period thereby preventing any metastability in the predicted memory access control value; and in the event that the predicted memory access control value indicates that a read access is likely to occur, causing a read access to be initiated from the memory. Through this approach, an indication that an instruction is to be processed by the pipelined data processing apparatus is received and a memory access prediction signal indicative of whether or not the instruction is likely to cause a read access from a memory is then generated. The predicted memory access control signal is generated in a way which prevents any metastability being present in that signal. Hence, the signals used in a read access are prevented from being metastable which removes the possibility that metastable signals are used directly in the arbitration of data accesses. Also, the metastable signals may be prevented from being propagated from stage to stage.
Description
- The present invention relates to data access. Embodiments of the present invention relate to data access in a data processing apparatus in which signals used to cause a data access to occur may be metastable.
- In a data processing apparatus, such as a pipelined data processing apparatus, a series of serially-connected processing stages is formed. Between each stage of the pipeline a signal-capture element such as a latch or a sense amplifier may be provided into which one or more signal values are stored.
- The logic of each processing stage is responsive to input signals received from preceding processing stages or from elsewhere and generates output signals to be stored in an associated output latch. In a typical pipelined data processing apparatus, the time taken for the processing logic to complete any processing operations determines the speed at which the data processing apparatus may operate. If the processing logic of the processing stages is able to complete its processing operations in a short period of time, then the signals may rapidly advance through the output latches, resulting in high speed processing. However, the system cannot advance signals between stages more rapidly than the speed at which the slowest processing logic in a stage is able to perform its processing operations on received input signals and generate the appropriate output signals. This limits the performance of the system.
- Some known techniques seek to overcome some of these processing speed limitations. For example, it is possible to increase the rate at which the processing stages are driven until the slowest processing stage is unable to keep pace. Also, to reduce the power consumption of the data processing apparatus, the operating voltage may be reduced up to the point at which the slowest processing stage is no longer able to keep pace. It will be appreciated that in both of these situations processing errors may occur.
- These processing errors typically occur because the output signal to be stored in the associated output latch does not achieve a predetermined stable voltage level for a period of time prior to a clock signal being provided to the latch (known as the set-up period), or because the output signal is not held for a predetermined period after the clock signal is provided to the output latch (known as the hold period).
- The change of state of the signal during these errors is transient (i.e. it is pulse like) and a reset or a rewrite of the latch or device causes normal behaviour to resume thereafter. The signal in this transient state is said to be metastable because it fails to achieve a valid logic level for a period of time, but instead hovers at a metastable voltage somewhere between the logic levels, before transitioning to a valid logic level.
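- As a rough software analogy (not part of the patent), the sketch below models a latch whose captured value is undefined when its data input changes inside the set-up/hold window around the sampling clock edge; the timing figures are arbitrary.

```python
SETUP = 0.2  # arbitrary set-up time required before the clock edge
HOLD = 0.1   # arbitrary hold time required after the clock edge

def sample_latch(data_value, data_change_time, clock_edge_time):
    """Return the captured value, or 'METASTABLE' if the data input changed
    inside the set-up/hold window around the sampling clock edge."""
    if clock_edge_time - SETUP < data_change_time < clock_edge_time + HOLD:
        # The real latch output would hover between logic levels for an
        # indeterminate time before settling to an unpredictable 0 or 1.
        return "METASTABLE"
    return data_value

print(sample_latch(1, data_change_time=0.95, clock_edge_time=1.0))  # METASTABLE
print(sample_latch(1, data_change_time=0.60, clock_edge_time=1.0))  # 1
```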
- In a data processing apparatus which has a memory, it is desirable to perform accesses to that memory as quickly as possible since this has an obvious beneficial effect on processor throughput.
- The structure of a memory, such as a single-ported cache, is such that both read accesses and write accesses occur using a common address interface. Data should only be written to the cache (known as committing) when the write access has been confirmed to not contain any errors.
- In the case of a write access, if it transpires that the write access is in some way incorrect or invalid then the data stored in the memory may be corrupt. Furthermore, should the signals used in a write access be metastable then the data stored in the memory may be corrupt. These problems can be overcome by adding extra stages to the processing logic which can detect that such an error has occurred due to the presence of this metastability. The metastability determination can then be made prior to the data being committed to memory. The metastability determination is typically performed at system level and takes a number of processing cycles. Hence, the write access may be buffered in a write buffer and only committed some cycles later when it is known that no errors have occurred. It will be appreciated that such an arrangement has a minimal impact on throughput since write accesses will rarely be on the critical path.
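- A minimal sketch of this buffered-write scheme, assuming a two-cycle stabilisation period before a write may be committed and giving priority to outstanding reads; the class and method names are illustrative and do not come from the patent.

```python
from collections import deque

STABILISATION_CYCLES = 2  # cycles needed to confirm the write is error free

class StoreBuffer:
    """FIFO write buffer that commits a write only after it has aged for the
    stabilisation period, no error has been flagged, and no read is pending."""

    def __init__(self):
        self.pending = deque()   # entries: [age, address, data, error_flag]

    def push(self, address, data):
        self.pending.append([0, address, data, False])

    def flag_error(self):
        # Called by system-level error detection against the newest write.
        if self.pending:
            self.pending[-1][3] = True

    def tick(self, memory, read_outstanding):
        # Age every buffered write by one cycle.
        for entry in self.pending:
            entry[0] += 1
        # Commit the oldest write once it is old enough, error free, and no
        # read access is outstanding (reads have priority on the common bus).
        if self.pending and not read_outstanding:
            age, address, data, error = self.pending[0]
            if age >= STABILISATION_CYCLES and not error:
                memory[address] = data
                self.pending.popleft()

memory = {}
buf = StoreBuffer()
buf.push(0x40, 123)
for _ in range(3):
    buf.tick(memory, read_outstanding=False)
print(memory)  # {64: 123} once the stabilisation period has elapsed
```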
- However, it is desirable to execute read accesses as soon as possible. This is because read accesses will typically be on the critical path and any latency in executing read accesses will have a detrimental effect on throughput. Accordingly, the pipelined stages prior to the execution stages are typically optimised to process read accesses as quickly as possible. For example, typical fetch and decode stages would normally be optimised to fetch a read access instruction in a single processing cycle and then decode that instruction in a subsequent single processing cycle. This ensures that the execution of the read access can occur at an early stage.
- Also, arbitration techniques are provided in order to deal with the occurrence of concurrent read and write access over the common buses, with read accesses being given priority over write accesses. Accordingly, read accesses are performed in preference, with write accesses being placed in the write buffer and postponed until after the write access is confirmed to be error free and no read accesses are outstanding.
- It is desired to provide improved techniques for performing data accesses.
- According to one aspect of the present invention there is provided a method of accessing data in a pipelined data processing apparatus in which the operating conditions of the pipelined data processing apparatus are such that metastable signals may occur on at least the boundaries of the pipelined stages, the method comprising the steps of: receiving an indication that an instruction is to be processed by the pipelined data processing apparatus; generating a memory access prediction signal, the memory access prediction signal having a value indicative of whether or not the instruction is likely to cause a read access from a memory; generating a predicted memory access control value from the memory access prediction signal, the predicted memory access control value being generated to achieve and maintain a valid logic level for at least a sampling period thereby preventing any metastability in the predicted memory access control value; and in the event that the predicted memory access control value indicates that a read access is likely to occur, causing a read access to be initiated from the memory.
- The present invention recognises that a problem exists whereby the signals used in a read access may be metastable and that this may cause metastable signals to be used directly in the arbitration of data accesses. This in turn can result in many different types of errors occurring when accessing data. In an extreme case, these errors may cause the data to become corrupted. It will be appreciated that corrupting data is undesirable at the best of times; however, data corruption due to metastability is particularly disadvantageous since it will be almost impossible to determine that the corruption has occurred, because it is extremely unlikely that the status of the signals causing the corruption can be determined.
- Also, the present invention recognises that the metastable signals may be propagated from stage to stage. For example, in arrangements where single cycle fetch and single cycle decode stages are provided, but more than one cycle is required to determine whether signals are metastable, the propagation of metastable signals into, for example, an execute stage cannot easily be prevented without postponing the execution of the data access itself.
- As mentioned previously, write accesses are postponed by buffering until it is ensured that the write access is valid. Buffering the write access does not adversely affect throughput since the write access will rarely be on the critical path. However, the present invention also recognises that whilst it may be possible to postpone read accesses, those read accesses are typically on the critical path and any delay in performing the read access will cause instructions in the pipeline to be stalled, thereby significantly reducing the throughput of the data processing apparatus.
- Accordingly, an indication that an instruction is to be processed by the pipelined data processing apparatus is received and a memory access prediction signal is then generated. The memory access prediction signal has a value indicative of whether or not the instruction is likely to cause a read access from a memory. Hence, an indication is provided when the instruction is likely to cause a read access. A predicted memory access control signal is generated from the memory access prediction signal.
- The predicted memory access control signal is generated in a way which prevents any metastability being present in that signal. This is achieved by the predicted memory access control signal achieving and maintaining a valid logic level for at least a sampling period. A read access can then be initiated in the event that it is predicted that a read access is likely to occur.
- In this way, a signal used to initiate a read access can be generated in a way which ensures that it will have no metastability. This is possible because that signal is merely a prediction signal rather than the decoded instruction itself and, hence, can be generated much earlier in the pipeline. Because the prediction signal is generated much earlier in the pipeline, it can be ensured that the signal used to cause the memory access has no metastability.
- Hence, the signals used in a read access are prevented from being metastable which removes the possibility that metastable signals are used directly in the arbitration of data accesses. Also, the metastable signals may be prevented from being propagated from stage to stage.
- In embodiments, the step of generating the memory access prediction signal comprises the steps of: determining a program counter value associated with the instruction to be processed; and referencing a lookup table to provide the value indicative of whether or not the instruction associated with that program counter value is likely to cause a read access from the memory; and propagating the value provided by the lookup table as the memory access prediction signal.
- By referencing a lookup table, a rapid determination can be made of whether the instruction associated with a program counter value is anticipated to cause a read access to occur.
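- For illustration only, such a lookup table can be modelled as a dictionary keyed by the program counter value; the training rule shown here (update only when the prediction and the resolved actual behaviour disagree, as suggested later in the description) is one possible choice rather than the patent's required behaviour.

```python
class ReadAccessPredictor:
    """Lookup table keyed by program counter value, recording whether the
    instruction at that address previously caused a read access."""

    def __init__(self, default=False):
        self.table = {}
        self.default = default    # prediction for a PC never seen before

    def predict(self, pc):
        # Referenced during the fetch stage; yields the PREDICT_FE value.
        return self.table.get(pc, self.default)

    def update(self, pc, did_read):
        # Trained from the resolved ACTUAL_EX value, typically only when
        # the prediction and the actual behaviour disagreed.
        if self.predict(pc) != did_read:
            self.table[pc] = did_read

predictor = ReadAccessPredictor()
print(predictor.predict(0x1000))        # False - no history yet
predictor.update(0x1000, did_read=True)
print(predictor.predict(0x1000))        # True - a read is now predicted
```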
- In embodiments, the step of determining the program counter value occurs when processing the instruction during a fetch stage of the pipelined processor.
- By making the determination early in the pipeline, sufficient time is provided to enable the predicted memory access control value to achieve a non-metastable condition by the time that that signal needs to be used to cause the read access to occur.
- In embodiments, the method further comprises the step of: storing in the lookup table the value indicative of whether or not the instructions associated with program counter values are likely to cause read accesses from the memory.
- In embodiments, the step of generating the predicted memory access control value comprises the steps of: passing the memory access prediction signal through a synchronising structure to generate the predicted memory access control value having a valid logic level.
- In embodiments, the step of generating the predicted memory access control value comprises the steps of: passing the memory access prediction signal through a pair of latches, each latch being clocked to coincide with the passing of the instruction between subsequent boundaries of the pipelined stages.
- Passing the memory access prediction signal through the pair of latches ensures that the resultant signal will have no metastability.
- In embodiments, the step of generating the predicted memory access control value comprises the steps of: passing the memory access prediction signal to an input of a first latch; providing an intermediate signal on the output of the first latch as the instruction passes between first and second pipelined stages; passing the intermediate signal to an input of a second latch; and providing the predicted memory access control value on the output of the second latch as the instruction passes between second and third pipelined stages.
- In embodiments, the first, second and third pipelined stages comprise fetch, decode and execute pipelined stages.
- In embodiments, the step of passing the memory access prediction signal through a pair of latches causes the predicted memory access control value to have timing characteristics which achieve a valid logic level prior to a setup period prior to a sampling clock transitioning, said valid logic level being held during a hold period following said sampling clock transitioning.
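- The pair of latches can be pictured as a two-stage shift register clocked at the pipeline-stage boundaries: the prediction made during fetch is re-sampled at the fetch/decode and decode/execute boundaries, so the value presented at the execute stage has had two full cycles to settle. The Python model below is a schematic illustration only; the PREDICT_FE/PREDICT_DE/PREDICT_EX names follow the embodiment described later.

```python
class TwoLatchSynchroniser:
    """Two latches clocked as the instruction crosses pipeline boundaries.

    PREDICT_FE -> latch 1 -> PREDICT_DE -> latch 2 -> PREDICT_EX
    """

    def __init__(self):
        self.predict_de = False  # output of the first latch
        self.predict_ex = False  # output of the second latch

    def rising_edge(self, predict_fe):
        # Latch 2 captures the old PREDICT_DE while latch 1 captures
        # PREDICT_FE, so a prediction takes two edges to reach PREDICT_EX.
        self.predict_ex = self.predict_de
        self.predict_de = predict_fe
        return self.predict_ex

sync = TwoLatchSynchroniser()
fe_values = [True, False, False]   # prediction asserted during the fetch cycle
for edge, fe in enumerate(fe_values, start=1):
    print(f"edge {edge}: PREDICT_FE={fe} -> PREDICT_EX={sync.rising_edge(fe)}")
# edge 1: PREDICT_FE=True -> PREDICT_EX=False   (value now held as PREDICT_DE)
# edge 2: PREDICT_FE=False -> PREDICT_EX=True   (instruction reaches execute)
# edge 3: PREDICT_FE=False -> PREDICT_EX=False
```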
- In embodiments, in the event that the predicted memory access control value indicates that a read access is likely to occur, the step of causing the read access to be initiated from the memory occurs when the associated instruction is being executed in the execute pipelined stage.
- Hence, the read access is initiated at the appropriate stage in the pipeline, but using the memory access prediction signal which is assured to not be metastable.
- In embodiments, the step of generating the memory access prediction signal further includes the step of: generating a timing value indicative of when the associated instruction is likely to be executed in the execute pipelined stage and the step of generating the predicted memory access control value from the memory access prediction signal is responsive to the timing value such that the predicted memory access control value is provided for at least a period in which the associated instruction is likely to be executed in the execute pipelined stage.
- In embodiments, the method further comprises the steps of: processing the instruction in the pipelined stages, the instruction causing an actual memory access signal to be generated; in the event the actual memory access signal has a value indicating a read access from the memory is to occur and the predicted memory access control value indicates that a read access is not likely to occur, causing the execution of the instruction to be stalled whilst an actual memory access control value is generated from the actual memory access signal, the actual memory access control value being generated to have a valid logic level thereby removing any metastability in the actual memory access signal value, and in the event that the actual memory access control value indicates that a read access is to occur, causing a read access to be initiated from the memory.
- Hence, should the memory access prediction signal not predict that a memory access should occur, the instruction is stalled until the actual memory access control value has been cleaned to remove any metastability, in the same way as the memory access prediction signal was; in the event that the actual memory access control value indicates that a read access is to occur, a read access is initiated from the memory.
- It will be appreciated that the resultant actual memory access control value may be used to update the lookup table.
- According to a second aspect of the present invention there is provided an integrated circuit operable to access data in a pipelined data processing apparatus in which the operating conditions of the pipelined data processing apparatus are such that metastable values may occur on at least the boundaries of the pipelined stages, the integrated circuit comprising: a read access prediction circuit operable to receive an indication that an instruction is to be processed by the pipelined data processing apparatus, the read access prediction circuit being further operable to generate a memory access prediction signal, the memory access prediction signal having a value indicative of whether or not the instruction is likely to cause a read access from a memory; a prediction signal stabilising circuit operable to generate a predicted memory access control value from the memory access prediction signal, the predicted memory access control value being generated to achieve and maintain a valid logic level for at least a sampling period thereby removing any metastability in the memory access prediction signal value; and a memory access circuit operable, in the event that the predicted memory access control value indicates that a read access is likely to occur, to cause a read access to be initiated from the memory.
- According to a second aspect of the present invention there is provided an integrated circuit for accessing data in a pipelined data processing apparatus in which the operating conditions of the pipelined data processing apparatus are such that metastable values may occur on at least the boundaries of the pipelined stages, the integrated circuit comprising: read access prediction means for receiving an indication that an instruction is to be processed by the pipelined data processing apparatus and for generating a memory access prediction signal, the memory access prediction signal having a value indicative of whether or not the instruction is likely to cause a read access from a memory; prediction signal stabilising means for generating a predicted memory access control value from the memory access prediction signal, the predicted memory access control value being generated to achieve and maintain a valid logic level for at least a sampling period thereby removing any metastability in the memory access prediction signal value; and memory access means for, in the event that the predicted memory access control value indicates that a read access is likely to occur, causing a read access to be initiated from the memory.
- Embodiments of the present invention will now be described with reference to the accompanying drawings in which
- FIG. 1 illustrates a data processing apparatus according to an embodiment of the present invention;
- FIG. 2 is a timing diagram illustrating the operation of the read access prediction logic and the misprediction logic of FIG. 1; and
- FIG. 3 is a flow chart illustrating in more detail the read access prediction technique performed by the data processing apparatus of FIG. 1.
- FIG. 1 illustrates a data processing apparatus, generally 10, according to an embodiment of the present invention. The data processing apparatus 10 comprises a processor core 20 coupled with a data RAM 30 and an error detection/correction unit 40.
- The processor core 20 is operable to process instructions and data retrieved from a main memory (not shown). The data RAM 30 is arranged to store data so that it is subsequently readily accessible by the processor core 20. The data RAM 30 will store the data values associated with a memory address until they are overwritten by data values for a new memory address required by the processor core 20. The data values are stored in the data RAM 30 using either physical or virtual memory addresses. Well known cache allocation policies may be used when reading or writing data values to the data RAM 30.
- Coupled with the processor core 20 is an error detection/correction unit 40. The error detection/correction unit 40 is operable to determine whether any errors occur during the processing of instructions. For example, the error detection/correction unit 40 will, at a system level, determine whether any timing violations have occurred in any of the signals used in the processing of data and whether any metastability may have resulted.
- In the event that it is determined that metastability might have occurred then the error detection/correction unit 40 will initiate the appropriate corrective measures in order to prevent incorrect operation of the data processing apparatus 10. For example, in the event that an error is detected, the operation of the data processing apparatus 10 may be reset or restarted from a safe position.
- The processor core 20 comprises a pipeline 90 coupled with write logic 50, read access prediction logic 70, misprediction logic 80 and cache interface logic 60.
- The write logic 50 comprises a store buffer 100 operable to store data values which have been indicated as being required to be allocated to the data RAM 30, and commit logic 110 which determines when data values stored in the store buffer 100 are available for storing in the data RAM 30.
- The store buffer 100 comprises a first-in first-out buffer which receives data values from a write-back stage 240 of the pipeline 90. Data values to be placed in the store buffer 100 are qualified by stabilisation stages (not shown) which are provided between the write-back stage 240 and the store buffer 100. The stabilisation stages store the data values therein for a predetermined number of clock cycles. Once the predetermined number of clock cycles (in this example two clock cycles) has passed, the data value is stored in the store buffer 100 and will be available to the commit logic 110 for allocation to the data RAM 30. In this way, it can be ensured that the data values and signals used to write to the data RAM 30 have no metastability and, hence, no errors will occur in the data being written to the data RAM 30.
- When the commit logic 110 receives data values from the store buffer 100 to be stored in the data RAM 30, the commit logic 110 provides a number of signals to the cache interface logic 60. These signals indicate whether data values are now available to be written to the data RAM 30 (W_VALID), the address associated with that data (W_ADD) and the data values themselves (W_DATA).
- The W_VALID signal and the output from the OR gate 180 are provided to an OR gate 112. Should the W_VALID signal or the output from the OR gate 180 be set (indicating that either a write or a read access is to occur) then the Chip Select input of the data RAM 30 will be set. The W_VALID signal is provided to an AND gate 114, and the output from the OR gate 180 is provided to an inverting input of the AND gate 114. Should the W_VALID signal be set (indicating a write access is requested) and the output of the OR gate 180 be cleared (indicating that no read access is requested), then the write/read input of the data RAM 30 will be set to indicate that a write access should occur; otherwise the write/read input of the data RAM 30 will be cleared to indicate that a read access should occur. The output of the OR gate 180 is provided to a multiplexer 116 to select either a write address provided by the commit logic 110 or a read address provided by the execute stage 220, depending on whether a write or a read access is to occur.
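- The gating just described reduces to three Boolean relationships plus an address multiplexer. The sketch below expresses them in Python purely for clarity; the gate and signal names follow FIG. 1, but the function itself is an interpretation and not part of the patent.

```python
def cache_interface(w_valid, read_request, w_add, read_add):
    """Models the OR gate 112, AND gate 114 and multiplexer 116.

    w_valid      : commit logic indicates buffered write data is available
    read_request : output of the OR gate 180 (a read access is to occur)
    Returns (chip_select, write_not_read, address) driven to the data RAM.
    """
    chip_select = w_valid or read_request               # OR gate 112
    write_not_read = w_valid and not read_request       # AND gate 114
    address = read_add if read_request else w_add       # multiplexer 116
    return chip_select, write_not_read, address

# A read request wins the arbitration even when a write is pending.
print(cache_interface(w_valid=True, read_request=True, w_add=0x80, read_add=0x20))
# (True, False, 32) - chip selected, read access performed, read address chosen
```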
- Coupled with the fetch stage 200 of the pipeline 90 is the read access prediction logic 70. The read access prediction logic 70 receives from the fetch stage 200 the value of the program counter associated with the instruction being fetched by the fetch stage 200. - The value of the program counter is provided to a
read prediction circuit 120. The read prediction circuit 120 stores historic information indicating whether an instruction associated with that program counter value resulted in a read access to the data RAM 30. - In the event that the
read prediction circuit 120 indicates that the program counter address is likely to be associated with a read access to the data RAM 30 then a prediction signal PREDICT_FE is asserted over the path 125 to an input of a first latch 130. Otherwise, no signal is asserted to the first latch 130.
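- By way of illustration only, the history held by the read prediction circuit 120 can be thought of as a lookup table indexed by the program counter value. The sketch below assumes a simple one-bit history per program counter and invented names (ReadPredictionTable, predict, update); the embodiment does not prescribe any particular table organisation.

class ReadPredictionTable:
    """Illustrative program-counter-indexed history for the read prediction circuit."""

    def __init__(self):
        self.history = {}  # program counter value -> True if a read access occurred last time

    def predict(self, program_counter):
        # PREDICT_FE is asserted if this program counter previously caused a read access.
        return self.history.get(program_counter, False)

    def update(self, program_counter, read_occurred):
        # Trained only from a resolved (non-metastable) version of the actual outcome.
        self.history[program_counter] = read_occurred

# Example: once a load at program counter 0x1000 has been seen, a read is predicted next time.
table = ReadPredictionTable()
assert table.predict(0x1000) is False
table.update(0x1000, True)
assert table.predict(0x1000) is True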
- On the rising edge of the next clock cycle, the PREDICT_FE signal is clocked through the first latch 130 and provided as a predict signal PREDICT_DE to the input of a second latch 140. - On the rising edge of the next clock cycle, the
second latch 140 outputs a signal PREDICT_EX to the misprediction logic 80. - In this way, it will be appreciated that a simple prediction can be made as to whether the instruction being fetched will likely result in a read access to the
data RAM 30 occurring. In the event that the load prediction circuit 120 indicates that a read access will occur then two cycles will have passed by the time this prediction signal has reached the execute stage 220, having been clocked through two latches. Accordingly, the prediction signal PREDICT_EX can be guaranteed not to be metastable.
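- The latches 130 and 140 behave as a conventional two-stage synchroniser that keeps the prediction aligned with the instruction as it moves from fetch to decode to execute. A minimal Python sketch of that timing is given below; the class and method names are assumptions made for the example only.

class PredictionSynchroniser:
    """Illustrative two-latch pipeline for the prediction signal:
    PREDICT_FE (fetch) -> latch 130 -> PREDICT_DE (decode) -> latch 140 -> PREDICT_EX (execute)."""

    def __init__(self):
        self.predict_de = False  # output of the first latch 130
        self.predict_ex = False  # output of the second latch 140

    def rising_edge(self, predict_fe):
        # On each clock edge the prediction advances one stage, matching the
        # instruction's progress through the pipeline.
        self.predict_ex = self.predict_de
        self.predict_de = predict_fe
        return self.predict_ex

# Example: a prediction asserted at fetch appears on PREDICT_EX after two rising edges.
sync = PredictionSynchroniser()
outputs = [sync.rising_edge(fe) for fe in (True, False, False)]
assert outputs == [False, True, False]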
- As the prediction signal is being clocked through the latches, the instruction which was used by the load prediction circuit 120 to generate the prediction signal also passes through the pipeline 90. By the time that the PREDICT_EX signal reaches the misprediction logic 80, the execute stage 220 will have generated an ACTUAL_EX signal which indicates whether the instruction appears to have resulted in a read access being required (it will be appreciated that the ACTUAL_EX signal may be metastable and so it is not certain that a read access will be required). - Accordingly, the PREDICT_EX and the ACTUAL_EX signals may be compared.
- In the event that both the ACTUAL_EX signal provided by the execute
stage 220 to the misprediction logic 80 and the PREDICT_EX signal indicate that a read access is to occur then the PREDICT_EX signal can be used to directly drive the cache interface logic 60 to cause a read of the data RAM 30. In this way, it will be appreciated that the signal used to cause a read from the data RAM 30 can be assured not to be metastable. This prevents many different types of errors from occurring when accessing data in the data RAM 30 and also helps to ensure that the data values in the data RAM 30 cannot become corrupted as a result of the read access. - In the event that the ACTUAL_EX signal indicates that a cache read should occur but the PREDICT_EX signal indicates that a cache read should not occur then the
misprediction logic 80 is used to resolve this conflict. - The
misprediction logic 80 comprises an AND gate 150, a first latch 160, a second latch 170, an OR gate 180 and stall logic 190. - The PREDICT_EX signal is received at an inverting input of the AND
gate 150 with the other non-inverting input receiving the ACTUAL_EX signal. - Accordingly, in the event that the PREDICT_EX signal does not indicate that a read access will occur then the output of the
OR gate 180 will be low, which will not cause a read access to be initiated in the data RAM 30. - However, in the event that the ACTUAL_EX signal indicates a read access should occur then the output of the AND
gate 150 will be asserted, which will cause the stall logic 190 to cause the memory stage and all earlier pipelined stages to stall. - On the next rising edge of the clock signal, the signal provided to the
first latch 160 will be output to the second latch 170. - On the next rising edge of the clock signal the output of the
second latch 170 will be provided to the stall logic and the OR gate 180. - By passing the output of the AND
gate 150 through the synchronising structure consisting of the first latch 160 and the second latch 170, any metastability in that signal is resolved, enabling the output of latch 170 to be used to initiate a data cache access in the event of a misprediction. - Also, the memory execute
stage 230 will be stalled for two cycles. In this way, in the event that the ACTUAL_EX signal is resolved at the output of the latch 170 to cause a read access from the data RAM 30 then the OR gate 180 will assert an output which causes the cache interface logic 60 to access the data from the data RAM 30. Because this signal has also been delayed for two cycles it can be ensured that the signal driving the cache access is also not metastable.
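- As an illustrative software model only, the misprediction path can be sketched as follows: when a read is actually required but was not predicted, the pipeline is stalled while the request is resolved through the two latches, and the resolved value is ORed with the prediction to drive the cache access. The class name, the per-edge method and the returned tuple are assumptions made for the sketch.

class MispredictionLogic:
    """Illustrative model of AND gate 150, latches 160 and 170, OR gate 180 and the stall request."""

    def __init__(self):
        self.latch_160 = False  # first synchronising latch
        self.latch_170 = False  # second synchronising latch (resolved request)

    def rising_edge(self, predict_ex, actual_ex):
        missed_read = actual_ex and not predict_ex   # AND gate 150
        self.latch_170 = self.latch_160              # second latch samples the first
        self.latch_160 = missed_read                 # first latch samples the AND gate
        stall = missed_read or self.latch_160 or self.latch_170
        read_access = predict_ex or self.latch_170   # OR gate 180 -> cache interface logic 60
        return read_access, stall

# Example trace for an unpredicted read: the request is resolved over two edges
# while the pipeline is stalled, after which the OR gate initiates the read.
logic = MispredictionLogic()
for cycle in range(3):
    read_access, stall = logic.rising_edge(predict_ex=False, actual_ex=True)
    print(cycle, read_access, stall)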
- In the event that the PREDICT_EX signal and the ACTUAL_EX signal differ, a resolved version of the ACTUAL_EX signal can be used in order to update the load prediction circuit 120 with details of whether a read access did or did not actually need to occur for that instruction having that program counter value. Alternatively, the ACTUAL_EX signal provided by the execute stage 220 may be provided via an alternative stabilisation structure in order to update the load prediction circuit 120. Hence, the load prediction circuit 120 may be updated when either a read access occurred, but a read access was not predicted, or a read access did not occur, but a read access was predicted. - Should it transpire that the PREDICT_EX signal used to cause a read access to occur was incorrect then the data value which was read may be simply discarded.
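- To make the training step concrete, the fragment below extends the illustrative ReadPredictionTable from the earlier sketch: the history is only ever written from a resolved (stabilised) version of ACTUAL_EX, and a wrongly predicted read simply results in the fetched data being dropped. The helper name and its arguments are assumptions made for the example.

def resolve_and_train(table, program_counter, predict_ex, actual_ex_resolved, read_data=None):
    """Illustrative update path: train the history only from the resolved outcome."""
    if predict_ex != actual_ex_resolved:
        # A misprediction in either direction updates the history for this program counter.
        table.update(program_counter, actual_ex_resolved)
    if predict_ex and not actual_ex_resolved:
        # A read was performed speculatively but was not needed: discard the data.
        read_data = None
    return read_data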
- FIG. 2 illustrates the operation of the read access prediction logic 70 and the misprediction logic 80 in more detail. - During
clock cycle 0, the program counter is used to generate the instruction fetch address. - On the rising edge of
clock cycle 1, the value of the program counter is latched into the fetch stage 200 and provided to the load prediction circuit 120. During clock cycle 1, the PREDICT_FE signal is determined, based on the value of the program counter. - On the rising edge of
clock cycle 2, the output of the load prediction circuit 120 is sampled by the first latch 130 and provided as the PREDICT_DE signal. - On the rising edge of
clock cycle 3, the PREDICT_DE signal is sampled by the second latch 140. During clock cycle 3, the output of the second latch 140 is provided as the PREDICT_EX signal to the misprediction logic 80. In the meantime, by clock cycle 3, the instruction has reached the execute stage 220 and the ACTUAL_EX signal is also presented to the misprediction logic 80. In the event that the PREDICT_EX signal indicates that a read access should occur, the read access will occur and if the ACTUAL_EX signal resolves to indicate that a read access should not occur then the read data will be discarded.
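- The cycle-by-cycle behaviour of FIG. 2 can be restated as the short, purely illustrative trace below; the variable names and the assignment of events to cycles follow the description above and are assumptions for the example.

# Illustrative trace of the FIG. 2 timing using three simple Boolean signals.
predict_fe = predict_de = predict_ex = False

# Cycle 0: the program counter generates the instruction fetch address.
# Cycle 1: the prediction is made from the program counter value.
predict_fe = True
# Rising edge into cycle 2: the first latch 130 samples PREDICT_FE.
predict_de, predict_fe = predict_fe, False
# Rising edge into cycle 3: the second latch 140 samples PREDICT_DE.
predict_ex, predict_de = predict_de, False
# Cycle 3: the now stable PREDICT_EX is compared with ACTUAL_EX in the misprediction logic 80.
assert predict_ex is True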
FIG. 3 illustrates the read access prediction technique in more detail. - At step s10, the cache access is determined using the PREDICT_EX signal.
- In the event that, at step s20, it is determined that the PREDICT_EX signal and the resolved ACTUAL_EX signal are identical then processing proceeds back to step s10.
- In the event that it is determined that the PREDICT_EX signal and the resolved ACTUAL_EX signal are not identical then processing proceeds to step s30.
- At step s30, it is determined whether a read was predicted by the PREDICT_EX signal but that the resolved ACTUAL_EX signal did not indicate that a read was required.
- If it is determined that a read was not required then, at step s40, the read data value is discarded and at step s50 the history information associated with the read
access prediction circuit 120 is updated to indicate that the instruction associated with that program counter value is not predicted to result in a read access. - If, at step s30, it is determined that the PREDICT_EX signal indicates that a read access was not predicted but that the ACTUAL_EX signal indicates that a read should occur then, at step s60, the pipeline will be stalled.
- Thereafter, at step s70, the ACTUAL_EX signal will be resolved, and in the event that the ACTUAL_EX signal continues to indicate that a read access should occur then the history information associated with the load prediction circuit will be updated to indicate that a read access should occur for the instruction associated with that program counter value.
- At step s80, the requested data value will be read from the
data RAM 30. - Finally, at step s90, the stall on the
pipeline 90 will be removed. - Through this approach, an indication that an instruction is to be processed by the pipelined data processing apparatus is received and a memory access prediction signal is then generated. The memory access prediction signal has a value indicative of whether or not the instruction is likely to cause a read access from a memory. The predicted memory access control signal is generated in a way which prevents any metastability being present in that signal. This is achieved by the predicted memory access control signal achieving and maintaining a valid logic level for at least a sampling period. A read access can then be initiated in the event that it is predicted that a read access is likely to occur. Hence, the signals used in a read access are prevented from being metastable which removes the possibility that metastable signals are used directly in the arbitration of data accesses. Also, the metastable signals may be prevented from being propagated from stage to stage.
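- A minimal Python sketch of the FIG. 3 flow (steps s10 to s90) is given below, reusing the illustrative ReadPredictionTable from the earlier example; the function name, the callable arguments and the return value are assumptions made for the sketch rather than features of the embodiment.

def handle_cache_access(table, program_counter, address, predict_ex, resolve_actual_ex, read_from_ram):
    """Illustrative restatement of steps s10 to s90 of FIG. 3."""
    # s10: the cache access is determined using PREDICT_EX.
    data = read_from_ram(address) if predict_ex else None

    actual_ex = resolve_actual_ex()           # resolved, non-metastable outcome
    if predict_ex == actual_ex:               # s20: the prediction was correct
        return data

    if predict_ex and not actual_ex:          # s30: read predicted but not required
        data = None                           # s40: discard the speculatively read value
        table.update(program_counter, False)  # s50: record that no read is expected
        return data

    # Read required but not predicted.
    # s60: stall the pipeline (represented here only by this comment).
    table.update(program_counter, True)       # s70: record that a read should occur
    data = read_from_ram(address)             # s80: perform the read
    # s90: remove the stall on the pipeline.
    return data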
- Although a particular embodiment of the invention has been described herewith, it will be apparent that the invention is not limited thereto, and that many modifications and additions may be made in the scope of the invention. For example, various combinations of the features of the following dependent claims could be made with the features of the independent claims without departing from the scope of the present invention.
Claims (25)
1. A method of accessing data in a pipelined data processing apparatus in which the operating conditions of the pipelined data processing apparatus are such that metastable signals may occur on at least the boundaries of the pipelined stages, the method comprising the steps of:
receiving an indication that an instruction is to be processed by the pipelined data processing apparatus;
generating a memory access prediction signal, the memory access prediction signal having a value indicative of whether or not the instruction is likely to cause a read access from a memory;
generating a predicted memory access control value from the memory access prediction signal, the predicted memory access control value being generated to achieve and maintain a valid logic level for at least a sampling period thereby preventing any metastability in the predicted memory access control value; and
in the event that the predicted memory access control value indicates that a read access is likely to occur, causing a read access to be initiated from the memory.
2. The method of claim 1 , wherein the step of generating the memory access prediction signal comprises the steps of:
determining a program counter value associated with the instruction to be processed; and
referencing a lookup table to provide the value indicative of whether or not the instruction associated with that program counter value is likely to cause a read access from the memory; and
propagating the value provided by the lookup table as the memory access prediction signal.
3. The method of claim 2 , wherein the step of determining the program counter value occurs when processing the instruction during a fetch stage of the pipelined processor.
4. The method of claim 1 , further comprising the step of:
storing in the lookup table the value indicative of whether or not the instructions associated with program counter values are likely to cause read accesses from the memory.
5. The method of claim 1 , wherein the step of generating the predicted memory access control value comprises the steps of:
passing the memory access prediction signal through a synchronising structure to generate the predicted memory access control value having a valid logic level.
6. The method of claim 1 , wherein the step of generating the predicted memory access control value comprises the steps of:
passing the memory access prediction signal through a pair of latches, each latch being clocked to coincide with the passing of the instruction between subsequent boundaries of the pipelined stages.
7. The method of claim 1 , wherein the step of generating the predicted memory access control value comprises the steps of:
passing the memory access prediction signal to an input of a first latch;
providing an intermediate signal on the output of the first latch as the instruction passes between first and second pipelined stages;
passing the intermediate signal to an input of a second latch; and
providing the predicted memory access control value on the output of the second latch as the instruction passes between second and third pipelined stages.
8. The method of claim 7 , wherein the first, second and third pipelined stages comprise fetch, decode and execute pipelined stages.
9. The method of claim 6 , wherein the step of passing the memory access prediction signal through a pair of latches causes the predicted memory access control value to have timing characteristics which achieve a valid logic level prior to a setup period prior to a sampling clock transitioning, said valid logic level being held during a hold period following said sampling clock transitioning.
10. The method of claim 1 , wherein in the event that the predicted memory access control value indicates that a read access is likely to occur, the step of causing the read access to be initiated from the memory occurs when the associated instruction is being executed in the execute pipelined stage.
11. The method of claim 10 , wherein the step of generating the memory access prediction signal further includes the step of:
generating a timing value indicative of when the associated instruction is likely to be executed in the execute pipelined stage and
the step of generating the predicted memory access control value from the memory access prediction signal is responsive to the timing value such that the predicted memory access control value is provided for at least a period in which the associated instruction is likely to be executed in the execute pipelined stage.
12. The method of claim 1 , further comprising the steps of:
processing the instruction in the pipelined stages, the instruction causing an actual memory access signal to be generated;
in the event the actual memory access signal has a value indicating a read access from the memory is to occur and the predicted memory access control value indicates that a read access is not likely to occur, causing the execution of the instruction to be stalled whilst an actual memory access control value is generated from the actual memory access signal, the actual memory access control value being generated to have a valid logic level thereby removing any metastability in the actual memory access signal value, and in the event that the actual memory access control value indicates that a read access is to occur, causing a read access to be initiated from the memory.
13. An integrated circuit operable to access data in a pipelined data processing apparatus in which the operating conditions of the pipelined data processing apparatus are such that metastable values may occur on at least the boundaries of the pipelined stages, the integrated circuit comprising:
a read access prediction circuit operable to receive an indication that an instruction is to be processed by the pipelined data processing apparatus, the read access prediction circuit being further operable to generate a memory access prediction signal, the memory access prediction signal having a value indicative of whether or not the instruction is likely to cause a read access from a memory;
a prediction signal stabilising circuit operable to generate a predicted memory access control value from the memory access prediction signal, the predicted memory access control value being generated to achieve and maintain a valid logic level for at least a sampling period thereby removing any metastability in the memory access prediction signal value; and
a memory access circuit operable, in the event that the predicted memory access control value indicates that a read access is likely to occur, to cause a read access to be initiated from the memory.
14. The integrated circuit of claim 13 , wherein the read access prediction circuit is operable to determine a program counter value associated with the instruction to be processed, to reference a lookup table to provide the value indicative of whether or not the instruction associated with that program counter value is likely to cause a read access from the memory and to propagate the value provided by the lookup table in the read access prediction signal.
15. The integrated circuit of claim 14 , wherein the read access prediction circuit is operable to determine the program counter value when processing the instruction during a fetch stage of the pipelined processor.
16. The integrated circuit of claim 13, further comprising:
the lookup table operable to store the value indicative of whether or not the instructions associated with program counter values are likely to cause read accesses from the memory.
17. The integrated circuit of claim 13 , wherein the prediction signal stabilising circuit comprises:
a synchronising structure operable to receive the memory access prediction signal and to generate the predicted memory access control value having a valid logic level.
18. The integrated circuit of claim 13 , wherein the prediction signal stabilising circuit comprises:
a pair of latches operable to receive the memory access prediction signal, each latch being clocked to coincide with the passing of the instruction between subsequent boundaries of the pipelined stages.
19. The integrated circuit of claim 13 , wherein the prediction signal stabilising circuit comprises:
a first latch operable to receive the memory access prediction signal and to provide an intermediate signal on the output of the first latch as the instruction passes between first and second pipelined stages;
a second latch operable to receive the intermediate signal and to provide the predicted memory access control value on the output of the second latch as the instruction passes between second and third pipelined stages.
20. The integrated circuit of claim 19 , wherein the first, second and third pipelined stages comprise fetch, decode and execute pipelined stages.
21. The integrated circuit of claim 17 , wherein the prediction signal stabilising circuit comprises:
a pair of latches operable to receive the memory access prediction signal and to generate the predicted memory access control value having timing characteristics which achieve a valid logic level prior to a setup period prior to a sampling clock transitioning, said valid logic level being held during a hold period following said sampling clock transitioning.
22. The integrated circuit of claim 13 , wherein the memory access circuit is operable, in the event that the predicted memory access control value indicates that a read access is likely to occur, to cause the read access to be initiated from the memory occurs when the associated instruction is being executed in the execute pipelined stage.
23. The integrated circuit of claim 22 , wherein the read access prediction circuit is operable to generate a timing value indicative of when the associated instruction is likely to be executed in the execute pipelined stage and
the prediction signal stabilising circuit is operable to generate the predicted memory access control value from the memory access prediction signal in response to the timing value such that the predicted memory access control value is provided for at least a period in which the associated instruction is likely to be executed in the execute pipelined stage.
24. The integrated circuit of claim 13 , further comprising:
pipelined stages operable to process the instruction, the instruction causing an actual memory access signal to be generated;
stall logic operable, in the event the actual memory access signal has a value indicating a read access from the memory is to occur and the predicted memory access control value indicates that a read access is not likely to occur, to cause the execution of the instruction to be stalled whilst an actual memory access control value is generated from the actual memory access signal, the actual memory access control value being generated to have a valid logic level thereby removing any metastability in the actual memory access signal value, and in the event that the actual memory access control value indicates that a read access is to occur, causing a read access to be initiated from the memory.
25. An integrated circuit for accessing data in a pipelined data processing apparatus in which the operating conditions of the pipelined data processing apparatus are such that metastable values may occur on at least the boundaries of the pipelined stages, the integrated circuit comprising:
read access prediction means for receiving an indication that an instruction is to be processed by the pipelined data processing apparatus and for generating a memory access prediction signal, the memory access prediction signal having a value indicative of whether or not the instruction is likely to cause a read access from a memory;
prediction signal stabilising means for generating a predicted memory access control value from the memory access prediction signal, the predicted memory access control value being generated to achieve and maintain a valid logic level for at least a sampling period thereby removing any metastability in the memory access prediction signal value; and
memory access means for, in the event that the predicted memory access control value indicates that a read access is likely to occur, causing a read access to be initiated from the memory.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/121,309 US20060253677A1 (en) | 2005-05-04 | 2005-05-04 | Data access prediction |
US12/068,598 US7653795B2 (en) | 2005-05-04 | 2008-02-08 | Control of metastability in the pipelined data processing apparatus |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/121,309 US20060253677A1 (en) | 2005-05-04 | 2005-05-04 | Data access prediction |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/068,598 Continuation US7653795B2 (en) | 2005-05-04 | 2008-02-08 | Control of metastability in the pipelined data processing apparatus |
Publications (1)
Publication Number | Publication Date |
---|---|
US20060253677A1 true US20060253677A1 (en) | 2006-11-09 |
Family
ID=37395324
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/121,309 Abandoned US20060253677A1 (en) | 2005-05-04 | 2005-05-04 | Data access prediction |
US12/068,598 Active 2025-08-16 US7653795B2 (en) | 2005-05-04 | 2008-02-08 | Control of metastability in the pipelined data processing apparatus |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/068,598 Active 2025-08-16 US7653795B2 (en) | 2005-05-04 | 2008-02-08 | Control of metastability in the pipelined data processing apparatus |
Country Status (1)
Country | Link |
---|---|
US (2) | US20060253677A1 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110153978A1 (en) * | 2009-12-21 | 2011-06-23 | International Business Machines Corporation | Predictive Page Allocation for Virtual Memory System |
US11620510B2 (en) * | 2019-01-23 | 2023-04-04 | Samsung Electronics Co., Ltd. | Platform for concurrent execution of GPU operations |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9189014B2 (en) * | 2008-09-26 | 2015-11-17 | Intel Corporation | Sequential circuit with error detection |
US7928768B1 (en) * | 2009-09-28 | 2011-04-19 | Altera Corporation | Apparatus for metastability-hardened storage circuits and associated methods |
US8599626B2 (en) | 2011-12-07 | 2013-12-03 | Arm Limited | Memory device and a method of operating such a memory device in a speculative read mode |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5377336A (en) * | 1991-04-18 | 1994-12-27 | International Business Machines Corporation | Improved method to prefetch load instruction data |
US6415380B1 (en) * | 1998-01-28 | 2002-07-02 | Kabushiki Kaisha Toshiba | Speculative execution of a load instruction by associating the load instruction with a previously executed store instruction |
US20020091915A1 (en) * | 2001-01-11 | 2002-07-11 | Parady Bodo K. | Load prediction and thread identification in a multithreaded microprocessor |
US20040003218A1 (en) * | 2002-06-28 | 2004-01-01 | Fujitsu Limited | Branch prediction apparatus and branch prediction method |
US6681317B1 (en) * | 2000-09-29 | 2004-01-20 | Intel Corporation | Method and apparatus to provide advanced load ordering |
US20040064663A1 (en) * | 2002-10-01 | 2004-04-01 | Grisenthwaite Richard Roy | Memory access prediction in a data processing apparatus |
US6781429B1 (en) * | 2003-06-18 | 2004-08-24 | Advanced Micro Devices | Latch circuit with metastability trap and method therefor |
US6986027B2 (en) * | 2000-05-26 | 2006-01-10 | International Business Machines Corporation | Universal load address/value prediction using stride-based pattern history and last-value prediction in a two-level table scheme |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7035997B1 (en) * | 1998-12-16 | 2006-04-25 | Mips Technologies, Inc. | Methods and apparatus for improving fetching and dispatch of instructions in multithreaded processors |
-
2005
- 2005-05-04 US US11/121,309 patent/US20060253677A1/en not_active Abandoned
-
2008
- 2008-02-08 US US12/068,598 patent/US7653795B2/en active Active
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5377336A (en) * | 1991-04-18 | 1994-12-27 | International Business Machines Corporation | Improved method to prefetch load instruction data |
US6415380B1 (en) * | 1998-01-28 | 2002-07-02 | Kabushiki Kaisha Toshiba | Speculative execution of a load instruction by associating the load instruction with a previously executed store instruction |
US6986027B2 (en) * | 2000-05-26 | 2006-01-10 | International Business Machines Corporation | Universal load address/value prediction using stride-based pattern history and last-value prediction in a two-level table scheme |
US6681317B1 (en) * | 2000-09-29 | 2004-01-20 | Intel Corporation | Method and apparatus to provide advanced load ordering |
US20020091915A1 (en) * | 2001-01-11 | 2002-07-11 | Parady Bodo K. | Load prediction and thread identification in a multithreaded microprocessor |
US20040003218A1 (en) * | 2002-06-28 | 2004-01-01 | Fujitsu Limited | Branch prediction apparatus and branch prediction method |
US20040064663A1 (en) * | 2002-10-01 | 2004-04-01 | Grisenthwaite Richard Roy | Memory access prediction in a data processing apparatus |
US6781429B1 (en) * | 2003-06-18 | 2004-08-24 | Advanced Micro Devices | Latch circuit with metastability trap and method therefor |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110153978A1 (en) * | 2009-12-21 | 2011-06-23 | International Business Machines Corporation | Predictive Page Allocation for Virtual Memory System |
US11620510B2 (en) * | 2019-01-23 | 2023-04-04 | Samsung Electronics Co., Ltd. | Platform for concurrent execution of GPU operations |
Also Published As
Publication number | Publication date |
---|---|
US20080209152A1 (en) | 2008-08-28 |
US7653795B2 (en) | 2010-01-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP3516508B1 (en) | Memory violation prediction | |
US6098166A (en) | Speculative issue of instructions under a load miss shadow | |
US7178010B2 (en) | Method and apparatus for correcting an internal call/return stack in a microprocessor that detects from multiple pipeline stages incorrect speculative update of the call/return stack | |
US8589763B2 (en) | Cache memory system | |
JP5357017B2 (en) | Fast and inexpensive store-load contention scheduling and transfer mechanism | |
US9886385B1 (en) | Content-directed prefetch circuit with quality filtering | |
US7730346B2 (en) | Parallel instruction processing and operand integrity verification | |
US9940137B2 (en) | Processor exception handling using a branch target cache | |
US6760835B1 (en) | Instruction branch mispredict streaming | |
US9075726B2 (en) | Conflict resolution of cache store and fetch requests | |
EP2503453A1 (en) | Processor core with data error detection, and method for instruction execution in the same, with error avoidance | |
US7653795B2 (en) | Control of metastability in the pipelined data processing apparatus | |
JP5089226B2 (en) | Hardware assisted exceptions for software miss handling of I/O address translation cache misses - Patents.com | |
US7962726B2 (en) | Recycling long multi-operand instructions | |
US11113065B2 (en) | Speculative instruction wakeup to tolerate draining delay of memory ordering violation check buffers | |
US20150248138A1 (en) | Storage circuitry and method for propagating data values across a clock boundary | |
US8495287B2 (en) | Clock-based debugging for embedded dynamic random access memory element in a processor core | |
US7843760B2 (en) | Interface circuit and method for coupling between a memory device and processing circuitry | |
US20180349278A1 (en) | Translation lookaside buffer purging with concurrent cache updates | |
US10310858B2 (en) | Controlling transition between using first and second processing circuitry | |
US7890739B2 (en) | Method and apparatus for recovering from branch misprediction | |
US9535697B2 (en) | Register window performance via lazy register fills | |
JP4253319B2 (en) | System for handling buffer data discard conditions | |
JP2005010995A (en) | Multiprocessor system and process for dealing with trouble of write-back thereof | |
US20080201531A1 (en) | Structure for administering an access conflict in a computer memory cache |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: ARM LIMITED, UNITED KINGDOM Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:BULL, DAVID MICHAEL;REEL/FRAME:016780/0597 Effective date: 20050519 |
|
STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |