Hello,
the latest ipex-llm (ipex-llm==2.3.0b20251104) fails to load the following models:
gpt-oss 120B, gpt-oss 20B
ENV
(ipex-llm) arc@xpu:~/llm/ipex-llm/llama-cpp$ uv pip install --pre --upgrade ipex-llm[cpp]
Using Python 3.11.14 environment at: /home/arc/miniconda3/envs/ipex-llm
Resolved 45 packages in 1.12s
Prepared 9 packages in 1.84s
Uninstalled 9 packages in 244ms
Installed 9 packages in 218ms
- bigdl-core-cpp==2.7.0b20251022
+ bigdl-core-cpp==2.7.0b20251104
- fsspec==2025.9.0
+ fsspec==2025.10.0
- hf-xet==1.1.11b1
+ hf-xet==1.2.0
- huggingface-hub==0.36.0rc0
+ huggingface-hub==0.36.0
- ipex-llm==2.3.0b20251022
+ ipex-llm==2.3.0b20251104
- networkx==3.5
+ networkx==3.6rc0
- psutil==7.1.1
+ psutil==7.1.3
- regex==2025.10.23
+ regex==2025.11.3
- safetensors==0.6.2
+ safetensors==0.7.0rc0
llama-server
tensor 'blk.0.ffn_down_exps.weight' has invalid ggml type 39 (NONE)
(ipex-llm) arc@xpu:~/llm/ipex-llm/llama-cpp$ ./llama-server -m ~/llm/models/UD-Q8_K_XL/gpt-oss-120b-UD-Q8_K_XL-00001-of-00002.gguf -ngl 99 -c 8192 --jinja
build: 1 (98abe88) with Intel(R) oneAPI DPC++/C++ Compiler 2025.0.4 (2025.0.4.20241205) for x86_64-unknown-linux-gnu
system info: n_threads = 36, n_threads_batch = 36, total_threads = 72
system_info: n_threads = 36 (n_threads_batch = 36) / 72 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | BMI2 = 1 | LLAMAFILE = 1 | OPENMP = 1 | REPACK = 1 |
main: binding port with default address family
main: HTTP server is listening, hostname: 127.0.0.1, port: 8080, http threads: 71
main: loading model
srv load_model: loading model '/home/arc/llm/models/UD-Q8_K_XL/gpt-oss-120b-UD-Q8_K_XL-00001-of-00002.gguf'
llama_model_load_from_file_impl: using device SYCL0 (Intel(R) Arc(TM) A770 Graphics) - 15473 MiB free
llama_model_load_from_file_impl: using device SYCL1 (Intel(R) Arc(TM) A770 Graphics) - 15473 MiB free
gguf_init_from_file_impl: tensor 'blk.0.ffn_down_exps.weight' has invalid ggml type 39 (NONE)
gguf_init_from_file_impl: failed to read tensor info
llama_model_load: error loading model: llama_model_loader: failed to load model from /home/arc/llm/models/UD-Q8_K_XL/gpt-oss-120b-UD-Q8_K_XL-00001-of-00002.gguf
llama_model_load_from_file_impl: failed to load model
common_init_from_params: failed to load model '/home/arc/llm/models/UD-Q8_K_XL/gpt-oss-120b-UD-Q8_K_XL-00001-of-00002.gguf'
srv load_model: failed to load model, '/home/arc/llm/models/UD-Q8_K_XL/gpt-oss-120b-UD-Q8_K_XL-00001-of-00002.gguf'
srv operator(): operator(): cleaning up before exit...
main: exiting due to model loading error
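For context: in recent upstream llama.cpp/ggml headers, type id 39 is GGML_TYPE_MXFP4, the quantization format the gpt-oss GGUFs ship with, so the "invalid ggml type 39 (NONE)" error suggests the bundled loader predates that type — that is an assumption on my part, not confirmed. The file itself can be sanity-checked independently of llama-server. Below is a minimal sketch (standard library only, assuming the documented GGUF v3 header layout: 4-byte magic, uint32 version, uint64 tensor count, uint64 metadata KV count, all little endian) that reads just the header:

```python
import struct

GGUF_MAGIC = b"GGUF"

def read_gguf_header(data: bytes):
    """Parse the fixed 24-byte GGUF header: magic, version (uint32),
    tensor count and metadata KV count (both uint64, little endian)."""
    if data[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file")
    version, n_tensors, n_kv = struct.unpack_from("<IQQ", data, 4)
    return version, n_tensors, n_kv

if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as f:
        header = f.read(24)
    version, n_tensors, n_kv = read_gguf_header(header)
    print(f"GGUF v{version}: {n_tensors} tensors, {n_kv} metadata keys")
```

If the header parses cleanly, the file is likely fine and the problem is on the loader side. To enumerate per-tensor type ids, the `gguf` Python package that ships with upstream llama.cpp is easier (`GGUFReader(path).tensors`, each entry exposing `.tensor_type`); any id the installed ipex-llm build does not recognize would be the one reported as NONE.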