WO2006017874A2 - Cache d'instruction pour systemes fonctionnant en temps reel - Google Patents
Cache d'instruction pour systemes fonctionnant en temps reel Download PDFInfo
- Publication number
- WO2006017874A2 WO2006017874A2 PCT/AT2005/000326 AT2005000326W WO2006017874A2 WO 2006017874 A2 WO2006017874 A2 WO 2006017874A2 AT 2005000326 W AT2005000326 W AT 2005000326W WO 2006017874 A2 WO2006017874 A2 WO 2006017874A2
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- cache
- function
- instruction cache
- real
- time
- Prior art date
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/3005—Arrangements for executing specific machine instructions to perform operations for flow control
- G06F9/30054—Unconditional branch instructions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3802—Instruction prefetching
- G06F9/3808—Instruction prefetching for instruction reuse, e.g. trace cache, branch target cache
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3802—Instruction prefetching
- G06F9/3814—Implementation provisions of instruction buffers, e.g. prefetch buffer; banks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/448—Execution paradigms, e.g. implementations of programming paradigms
- G06F9/4482—Procedural
- G06F9/4484—Executing subprograms
Definitions
- the invention relates to a real-time instruction cache.
- the maximum execution time of programs must be done by analyzing the programs and the necessary modeling of the system. A measurement of the execution time is not possible since it can not be ensured that all combinations of execution paths have been run through.
- Cache memory is an important part of processors to compensate for the Geschwin ⁇ dtechniksunter Kunststoff between main memory and processor.
- the well-known cache architectures are optimized for the average performance and not for predictable performance. This leads to difficult to predict or very pes ⁇ simistic WCET values.
- caches of real processors are analyzed. Architectural features mean that only 1/2 or 1/4 of the existing cache memory can be modeled.
- the object of the invention is to design an instruction cache whose real-time behavior can be modeled more accurately without restricting the program size.
- the task is solved by storing complete functions in the instruction cache.
- the instruction cache is loaded only when necessary during a function call or during a function return.
- the program counter 102 Due to the relative addressing, it does not matter at which cache position the function starts during the function execution.
- the program counter 102 only has to be loaded in the cache with the start address when the function is called.
- a function can span several contiguous blocks. Whereby a connection also exists from the last block to the first block, since the program counter is limited to the cache addressing and this leads to an automatically correct overflow or underflow.
- the program counter 102 and the associated buses and multiplexers can be implemented more simply because they are smaller. Also the address translation for the implementation of a virtual memory is only necessary when loading a function.
- the determination of a 'cach hit' is necessary only at the function call or at the return and is solved by reading the block RAM 105.
- the access to the 'tag RAM' and the address comparison are usually in the critical path of the hardware and thereby determine the minimum access time to the cache. Without comparison on each access, as in this invention, the access time to the cache can be reduced with the same technology.
- This memory technology is characterized by the fact that the first word is available only after a considerable delay, but the following is available in a shorter time. This initial delay has less effect on larger blocks than on small blocks.
- FIG. 1 shows the architecture of a processor which contains the subject matter of the invention.
- 2 shows by way of example the assignment of the cache blocks during execution of the program fragment in FIG. 3.
- the instruction cache 103 is located between the processor core 101 and the bus interface 104. Instructions are fetched via the bus 112 from the instruction cache 103. The instruction cache 103 is addressed via the program counter 102. Since this only addresses the cache, this and the associated buses 110 and 111 must be log2 (cache size) bits wide.
- the instruction cache 103 is filled by the bus interface 104 from the main memory 201 with complete functions.
- the buses 113 and 114 are the address or data buses between the bus interface 104 and the instruction cache 103. Load and memory requirements of the processor core 101 are handled via the address bus 117 and the data bus 118.
- the bus interface 104 handles the data exchange and loading of the instruction cache 103 with the main memory 201 via the address bus 210 and the data bus 211. Since the loading of the instruction cache 103 occurs only upon a function call or a return from a function, there are no conflicts with the load and memory requirements of the processor core 101.
- the block RAM 105 is used by the processor to store which blocks of the instruction cache 103 are occupied by which functions. It is addressed via the address bus 116 and the data bus 115.
- FIG. 2 shows the occupancy of cache blocks during the execution of the program outlined in FIG.
- the number of blocks and the strategy which blocks are replaced is only an example.
- the replacement strategy can be more complex than with conventional instruction caches because the decision is less likely (only when loading a complete function).
- the assignment of the blocks is stored in block RAM 105. This must be read to determine if a 'cache hit' exists and is written when a function is newly loaded into the instruction cache.
- the example in Figure 2 consists of 4 functions, where the functions A () and D () are small enough to fit in a block.
- Functions B () and C () are larger and occupy two blocks.
- 301 shows the state after the call of the function A (). The first block is occupied, the remaining three are free. Calling function B () within A () results in occupancy as shown in 302. There is only one more block left. The C () function called by B (), however, requires two blocks. As shown in 303, C () is loaded in block 4 and block 1, whereby function A () is no longer in the cache.
- the addressing of the function C () via the cache end (block 4) to the beginning of the cache (block 1) is implicitly effected by the limitation of the program counter 102 to the cache Cache size.
- the addition or subtraction beyond the cache boundary implicitly results in the correct overflow or underflow of the program counter 102.
- D () function B () or function C () is flushed out of the cache depends on the replacement strategy.
- the next block after a loaded function is used as start block for a new function to be loaded.
- the division into four blocks is only an example to simplify the illustration.
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Memory System Of A Hierarchy Structure (AREA)
Abstract
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AT0932505A AT505203A5 (de) | 2004-08-17 | 2005-08-12 | Instruction cache für echtzeitsysteme |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AT0138304A AT500858B8 (de) | 2004-08-17 | 2004-08-17 | Instruction cache für echtzeitsysteme |
ATA1383/2004 | 2004-08-17 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2006017874A2 true WO2006017874A2 (fr) | 2006-02-23 |
WO2006017874A3 WO2006017874A3 (fr) | 2006-11-23 |
Family
ID=35134456
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/AT2005/000326 WO2006017874A2 (fr) | 2004-08-17 | 2005-08-12 | Cache d'instruction pour systemes fonctionnant en temps reel |
Country Status (2)
Country | Link |
---|---|
AT (2) | AT500858B8 (fr) |
WO (1) | WO2006017874A2 (fr) |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4755935A (en) * | 1986-01-27 | 1988-07-05 | Schlumberger Technology Corporation | Prefetch memory system having next-instruction buffer which stores target tracks of jumps prior to CPU access of instruction |
JPH01205228A (ja) * | 1988-02-10 | 1989-08-17 | Hitachi Ltd | 命令バツフアシステム |
US5490262A (en) * | 1989-09-01 | 1996-02-06 | Oki Electric Industry Co., Ltd. | Dual cache memory device with cache monitoring |
GB9118312D0 (en) * | 1991-08-24 | 1991-10-09 | Motorola Inc | Real time cache implemented by dual purpose on-chip memory |
US5353425A (en) * | 1992-04-29 | 1994-10-04 | Sun Microsystems, Inc. | Methods and apparatus for implementing a pseudo-LRU cache memory replacement scheme with a locking feature |
US5974534A (en) * | 1994-02-14 | 1999-10-26 | Hewlett-Packard Company | Predecoding and steering mechanism for instructions in a superscalar processor |
KR19990087830A (ko) * | 1996-03-28 | 1999-12-27 | 포만 제프리 엘 | 컴퓨터 장치, 컴파일러 방법 및 다중 캐시 라인 사전적재방법 |
US5893142A (en) * | 1996-11-14 | 1999-04-06 | Motorola Inc. | Data processing system having a cache and method therefor |
US5913224A (en) * | 1997-02-26 | 1999-06-15 | Advanced Micro Devices, Inc. | Programmable cache including a non-lockable data way and a lockable data way configured to lock real-time data |
US6157981A (en) * | 1998-05-27 | 2000-12-05 | International Business Machines Corporation | Real time invariant behavior cache |
US7143268B2 (en) * | 2000-12-29 | 2006-11-28 | Stmicroelectronics, Inc. | Circuit and method for instruction compression and dispersal in wide-issue processors |
DE10204345A1 (de) * | 2002-02-01 | 2003-08-14 | Systemonic Ag | Verfahren zur Befehlsbearbeitung |
-
2004
- 2004-08-17 AT AT0138304A patent/AT500858B8/de not_active IP Right Cessation
-
2005
- 2005-08-12 AT AT0932505A patent/AT505203A5/de not_active Application Discontinuation
- 2005-08-12 WO PCT/AT2005/000326 patent/WO2006017874A2/fr active Application Filing
Also Published As
Publication number | Publication date |
---|---|
AT500858B8 (de) | 2007-02-15 |
AT500858B1 (de) | 2006-04-15 |
WO2006017874A3 (fr) | 2006-11-23 |
AT505203A5 (de) | 2008-11-15 |
AT500858A4 (de) | 2006-04-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
DE60210633T2 (de) | Verfahren und vorrichtungen zur verbesserung des durchsatzes von eingebetteten prozessoren auf cache-basis durch umschalten von tasks als reaktion auf eine cache-verfehlung | |
DE69031991T2 (de) | Verfahren und Gerät zur Beschleunigung von Verzweigungsbefehlen | |
DE69129919T2 (de) | Verfahren zur Kompilierung von Rechnerbefehlen, um Cachespeicherleistung zu verbessern | |
DE10084556B4 (de) | Optimierte Ausführung von statisch sehr wahrscheinlich vorhergesagten Verzweigungsbefehlen | |
DE69434728T2 (de) | Synchronisationssystem und verfahren in einem datencachesystem mit aufgeteiltem pegel | |
DE69114333T2 (de) | Rechner mit der Fähigkeit mehrere Befehle gleichzeitig auszuführen. | |
DE69032812T2 (de) | Vorrichtung und Verfahren zur parallelen Verarbeitung | |
DE19526007C2 (de) | Horizontal partitionierter Befehls-Cache-Speicher | |
DE69824688T2 (de) | System und Verfahren zur Leistungsoptimierung eines Rechnersystems | |
DE4222776C2 (de) | Parallelverarbeitungseinheit und Verfahren zum Ausführen von Befehlen | |
DE69224084T2 (de) | Rechneranordnung mit Mehrfachpufferdatencachespeicher und Verfahren dafür | |
DE69130858T2 (de) | Überlappende Serienverarbeitung | |
DE69030931T2 (de) | Mehrfachsequenzprozessorsystem | |
DE68924400T2 (de) | Fliessbanddatenverarbeitungsvorrichtung. | |
DE102004013676A1 (de) | Schleifenbetrieb mit null Overhead in einem Mikroprozessor mit Anweisungspuffer | |
DE3933849A1 (de) | Prozessorgesteuerte schnittstelle | |
DE68924719T2 (de) | Vorrichtung und Verfahren zur Ausführung eines Unterprogramms in einem Datenverarbeitungssystem mit Blockumschaltung. | |
DE10393803T5 (de) | Verfahren und Vorrichtung zum Bestimmen einer Seitenverwaltungsimplementierung bei dynamischem Speicher mit wahlfreiem Zugriff | |
DE102006041444B4 (de) | Schaltungsanordnung und Verfahren zum Erfassen einer Ausführungszeit eines Befehls in einem Rechnersystem | |
DE10045188A1 (de) | Cacheadresskonfliktvorrichtung ohne Speicherpuffer | |
DE69322244T2 (de) | Verfahren und System zur Erhöhung der Systemspeichergleichzeitigkeit eines Multiprozessor-Rechnersystems | |
DE19848742C2 (de) | Registerumbenennung bei einem Prozessor, der Instruktionen ausserhalb der sequentiellen Reihenfolge bearbeiten kann | |
DE60127520T2 (de) | Prozessor mit Befehlscache mit niedrigem Stromverbrauch | |
DE19526008A1 (de) | Vertikal partitionierter, primärer Befehls-Cache-Speicher | |
DE102014103139A1 (de) | Parallelisierte Ausführung von Single-Core Steuerungssoftware auf Multi-Core Fahrzeugsteuergeräten |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A2 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KM KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NG NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW |
|
AL | Designated countries for regional patents |
Kind code of ref document: A2 Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU LV MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |