+

Bause et al., 2024 - Google Patents

A Configurable and Efficient Memory Hierarchy for Neural Network Hardware Accelerator

Bause et al., 2024

View PDF
Document ID
12335131068162514311
Author
Bause O
Bernardo P
Bringmann O
Publication year
Publication venue
arXiv preprint arXiv:2404.15823

External Links

Snippet

As machine learning applications continue to evolve, the demand for efficient hardware accelerators, specifically tailored for deep neural networks (DNNs), becomes increasingly vital. In this paper, we propose a configurable memory hierarchy framework tailored for per …
Continue reading at arxiv.org (PDF) (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0844Multiple simultaneous or quasi-simultaneous cache accessing
    • G06F12/0846Cache with multiple tag or data arrays being simultaneously accessible
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0893Caches characterised by their organisation or structure
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/30Arrangements for executing machine-instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline, look ahead
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/76Architectures of general purpose stored programme computers
    • G06F15/78Architectures of general purpose stored programme computers comprising a single central processing unit
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/06Addressing a physical block of locations, e.g. base addressing, module addressing, memory dedication
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/44Arrangements for executing specific programmes
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/50Computer-aided design
    • G06F17/5009Computer-aided design using simulation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14Handling requests for interconnection or transfer
    • G06F13/16Handling requests for interconnection or transfer for access to memory bus
    • G06F13/1605Handling requests for interconnection or transfer for access to memory bus based on arbitration
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F1/00Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
    • G06F1/26Power supply means, e.g. regulation thereof
    • G06F1/32Means for saving power
    • G06F1/3203Power Management, i.e. event-based initiation of power-saving mode
    • G06F1/3234Action, measure or step performed to reduce power consumption
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/16Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/25Using a specific main memory architecture
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements

Similar Documents

Publication Publication Date Title
Lee et al. Decoupled direct memory access: Isolating CPU and IO traffic by leveraging a dual-data-port DRAM
US20210181974A1 (en) Systems and methods for low-latency memory device
Huang et al. Shuhai: A tool for benchmarking high bandwidth memory on FPGAs
JP4188233B2 (en) Integrated circuit device
Oliveira et al. MIMDRAM: An end-to-end processing-using-DRAM system for high-throughput, energy-efficient and programmer-transparent multiple-instruction multiple-data computing
CN114356840B (en) SoC system with in-memory/near-memory computing modules
US8706453B2 (en) Techniques for processor/memory co-exploration at multiple abstraction levels
Mathew et al. Design of a parallel vector access unit for SDRAM memory systems
EP4038506A1 (en) Hardware acceleration
Chandu A survey of memory controller architectures: Design trends and performance trade-offs
Haris et al. SECDA-TFLite: A toolkit for efficient development of FPGA-based DNN accelerators for edge inference
EP1845462B1 (en) Hardware emulations system for integrated circuits having a heterogeneous cluster of processors
US8706469B2 (en) Method and apparatus for increasing the efficiency of an emulation engine
Mishra et al. Processor-memory co-exploration driven by a memory-aware architecture description language
US7827023B2 (en) Method and apparatus for increasing the efficiency of an emulation engine
Van Lunteren et al. Coherently attached programmable near-memory acceleration platform and its application to stencil processing
US7606698B1 (en) Method and apparatus for sharing data between discrete clusters of processors
JP2009527858A (en) Hardware emulator with variable input primitives
Bause et al. A Configurable and Efficient Memory Hierarchy for Neural Network Hardware Accelerator
Vieira et al. Ndpmulator: Enabling full-system simulation for near-data accelerators from caches to dram
Vieira et al. A compute cache system for signal processing applications
US6604163B1 (en) Interconnection of digital signal processor with program memory and external devices using a shared bus interface
Bayliss et al. Methodology for designing statically scheduled application-specific SDRAM controllers using constrained local search
Abuelala A Framework for Comprehensive and Customizable Memory-Centric Workloads
US8468009B1 (en) Hardware emulation unit having a shadow processor
点击 这是indexloc提供的php浏览器服务,不要输入任何密码和下载