Could not create a primitive descriptor for a matmul primitive during qwen3 32b inference with portable llama.cpp

**Describe the bug**
1. Install the latest driver and the latest Windows build of llama.cpp from https://github.com/ipex-llm/ipex-llm/releases/tag/v2.3.0-nightly
2. Download the Qwen3-32B-Q8_0.gguf model.
3. Prepare a machine with 64 GB of RAM and an Intel iGPU.
4. Modify the registry key:
[HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\GraphicsDrivers\MemoryManager] "SystemPartitionCommitLimitPercentage"
Set it to 0x4b (75 decimal), which makes 75% of system memory usable as shared GPU memory.
5. Run: llama-cli.exe -m ..\models\Qwen3\Qwen3-32B-Q8_0.gguf -p "how to become an expert on GPU driver" -n 2048 -e -ngl 999 --color -c 2500 --temp 0 -no-cnv
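
For reference, the registry value from the steps above works out as follows. This is only a small arithmetic sketch; it assumes the "64GB machine" has 64 GiB of system RAM, and the variable names are illustrative.

```python
# Value written to SystemPartitionCommitLimitPercentage in the steps above.
commit_limit = 0x4B  # hexadecimal 0x4b is 75 in decimal

# Assumption: the "64GB machine" has 64 GiB of system RAM.
system_ram_gib = 64

# Share of system memory that may be committed as shared GPU memory.
shared_gpu_gib = system_ram_gib * commit_limit / 100

print(f"{commit_limit}% of {system_ram_gib} GiB = {shared_gpu_gib:.0f} GiB")
```

Under that assumption the iGPU can be given up to 48 GiB of shared memory.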

The run then fails during model warm-up with the following error:
llama_context:  SYCL_Host  output buffer size =     0.58 MiB
llama_kv_cache_unified:      SYCL0 KV buffer size =   632.00 MiB
llama_kv_cache_unified: size =  632.00 MiB (  2528 cells,  64 layers,  1 seqs), K (f16):  316.00 MiB, V (f16):  316.00 MiB
llama_context:      SYCL0 compute buffer size =  1497.80 MiB
llama_context:  SYCL_Host compute buffer size =    73.54 MiB
llama_context: graph nodes  = 2054
llama_context: graph splits = 2
common_init_from_params: setting dry_penalty_last_n to ctx_size = 2528
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)
**could not create a primitive descriptor for a matmul primitive**
**Exception caught at file:D:\actions-runner\release-cpp-oneapi_2024_2\_work\llm.cpp\llm.cpp\llama-cpp-bigdl\ggml\src\ggml-sycl\ggml-sycl.cpp, line:3552, func:operator()
SYCL error: CHECK_TRY_ERROR(op(ctx, src0, src1, dst, src0_dd_i, src1_ddf_i, src1_ddq_i, dst_dd_i, dev[i].split_dim_low, dev[i].split_dim_high, src1_ncols, src1_padded_col_size, stream)): Exception caught in this line of code.**
  in function ggml_sycl_op_mul_mat at D:\actions-runner\release-cpp-oneapi_2024_2\_work\llm.cpp\llm.cpp\llama-cpp-bigdl\ggml\src\ggml-sycl\ggml-sycl.cpp:3552
D:\actions-runner\release-cpp-oneapi_2024_2\_work\llm.cpp\llm.cpp\llama-cpp-bigdl\ggml\src\ggml-sycl\..\ggml-sycl\common.hpp:127: SYCL error
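
As a sanity check on the log above, the reported buffer sizes can be reproduced with simple arithmetic. This sketch uses only numbers printed in the log; the rounding of the context size to a multiple of 32 is an assumption that happens to match the reported 2528 cells, not something the log states.

```python
MIB = 1024 * 1024

# Values copied from the llama_kv_cache_unified line in the log.
cells, layers = 2528, 64
k_mib, v_mib = 316.00, 316.00
f16_bytes = 2  # K and V are stored as f16

# Number of f16 values stored per cell per layer for K.
k_values_per_cell_per_layer = k_mib * MIB / (cells * layers * f16_bytes)

# Assumption: the requested context (-c 2500) is rounded up to a multiple of 32.
requested_ctx = 2500
padded_ctx = -(-requested_ctx // 32) * 32

# SYCL0 buffers reported in the log, excluding the model weights.
kv_mib = k_mib + v_mib
sycl0_compute_mib = 1497.80
sycl0_total_mib = kv_mib + sycl0_compute_mib

print(k_values_per_cell_per_layer)  # values per cell per layer for K
print(padded_ctx)                   # padded context size
print(round(sycl0_total_mib, 2))    # KV + compute buffer on SYCL0, in MiB
```

The KV cache and compute buffer together account for only about 2.1 GiB on SYCL0, so the model weights make up most of the memory requested from the GPU.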


Could not create a primitive descriptor for a matmul primitive during qwen3 32b inference with portable llama.cpp #13320
