GPT-OSS 120B ERROR llama.cpp ipex-llm==2.3.0b20251104 #13331

@savvadesogle

Description

Hello,
the latest ipex-llm (ipex-llm==2.3.0b20251104) has an issue with the following models:
gpt-oss 120B, gpt-oss 20B

ENV

(ipex-llm) arc@xpu:~/llm/ipex-llm/llama-cpp$ uv pip install --pre --upgrade ipex-llm[cpp]
Using Python 3.11.14 environment at: /home/arc/miniconda3/envs/ipex-llm
Resolved 45 packages in 1.12s
Prepared 9 packages in 1.84s
Uninstalled 9 packages in 244ms
Installed 9 packages in 218ms
 - bigdl-core-cpp==2.7.0b20251022
 + bigdl-core-cpp==2.7.0b20251104
 - fsspec==2025.9.0
 + fsspec==2025.10.0
 - hf-xet==1.1.11b1
 + hf-xet==1.2.0
 - huggingface-hub==0.36.0rc0
 + huggingface-hub==0.36.0
 - ipex-llm==2.3.0b20251022
 + ipex-llm==2.3.0b20251104
 - networkx==3.5
 + networkx==3.6rc0
 - psutil==7.1.1
 + psutil==7.1.3
 - regex==2025.10.23
 + regex==2025.11.3
 - safetensors==0.6.2
 + safetensors==0.7.0rc0
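
For comparison, the uninstall list above shows the previously installed pre-release build; if needed it can be pinned back with uv (sketch only, not verified here to avoid the error):

# roll back to the prior pre-release shown in the uninstall list (assumed, not confirmed, to be unaffected)
uv pip install --pre "ipex-llm[cpp]==2.3.0b20251022"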

llama-server

tensor 'blk.0.ffn_down_exps.weight' has invalid ggml type 39 (NONE)

(ipex-llm) arc@xpu:~/llm/ipex-llm/llama-cpp$ ./llama-server -m ~/llm/models/UD-Q8_K_XL/gpt-oss-120b-UD-Q8_K_XL-00001-of-00002.gguf -ngl 99 -c 8192 --jinja
build: 1 (98abe88) with Intel(R) oneAPI DPC++/C++ Compiler 2025.0.4 (2025.0.4.20241205) for x86_64-unknown-linux-gnu
system info: n_threads = 36, n_threads_batch = 36, total_threads = 72

system_info: n_threads = 36 (n_threads_batch = 36) / 72 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | BMI2 = 1 | LLAMAFILE = 1 | OPENMP = 1 | REPACK = 1 | 

main: binding port with default address family
main: HTTP server is listening, hostname: 127.0.0.1, port: 8080, http threads: 71
main: loading model
srv    load_model: loading model '/home/arc/llm/models/UD-Q8_K_XL/gpt-oss-120b-UD-Q8_K_XL-00001-of-00002.gguf'
llama_model_load_from_file_impl: using device SYCL0 (Intel(R) Arc(TM) A770 Graphics) - 15473 MiB free
llama_model_load_from_file_impl: using device SYCL1 (Intel(R) Arc(TM) A770 Graphics) - 15473 MiB free
gguf_init_from_file_impl: tensor 'blk.0.ffn_down_exps.weight' has invalid ggml type 39 (NONE)
gguf_init_from_file_impl: failed to read tensor info
llama_model_load: error loading model: llama_model_loader: failed to load model from /home/arc/llm/models/UD-Q8_K_XL/gpt-oss-120b-UD-Q8_K_XL-00001-of-00002.gguf
llama_model_load_from_file_impl: failed to load model
common_init_from_params: failed to load model '/home/arc/llm/models/UD-Q8_K_XL/gpt-oss-120b-UD-Q8_K_XL-00001-of-00002.gguf'
srv    load_model: failed to load model, '/home/arc/llm/models/UD-Q8_K_XL/gpt-oss-120b-UD-Q8_K_XL-00001-of-00002.gguf'
srv    operator(): operator(): cleaning up before exit...
main: exiting due to model loading error
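
If it helps with triage: ggml type 39 appears to be GGML_TYPE_MXFP4 in recent upstream llama.cpp (the quantization the gpt-oss GGUFs ship with), so the bundled build may simply not recognize that type yet. One way to confirm what type the file actually declares for that tensor is gguf-dump from the gguf Python package (sketch only, assuming a recent gguf release that knows MXFP4):

# not run here; path taken from the log above
uv pip install gguf
gguf-dump ~/llm/models/UD-Q8_K_XL/gpt-oss-120b-UD-Q8_K_XL-00001-of-00002.gguf | grep ffn_down_exps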
