Describe the bug
No output is produced when --low-bit is set to sym_int8, mixed_fp8, or fp4.
Setting --low-bit to sym_int4 works.
How to reproduce
Steps to reproduce the error:
- Build the conda environment:
conda create -n llm python=3.11
conda activate llm
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
conda install pkg-config libuv
python -m pip install torch==2.1.0a0 torchvision==0.16.0a0 torchaudio==2.1.0a0 intel-extension-for-pytorch==2.1.10 --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
- Run inference:
python ./transformers_low_bit_pipeline.py --repo-id-or-model-path D:\StreamingMedia\model\LLM\Qwen2-1.5B-Instruct --low-bit sym_int8
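To narrow down which formats are affected, the command above can be repeated for each --low-bit value. The following is only a minimal sketch: the flag names are copied from the command above and the format list from the bug description, while `build_command`, `classify`, and `run_all` are hypothetical helper names, not part of ipex-llm. It assumes the script prints the generated text to stdout, so "ok" only means that something was printed.

```python
import subprocess
import sys

# Formats named in this report; sym_int4 is the one that works.
LOW_BIT_FORMATS = ["sym_int4", "sym_int8", "mixed_fp8", "fp4"]


def build_command(script, model_path, low_bit, python=sys.executable):
    """Build the argument list for one run of the pipeline script."""
    return [python, script,
            "--repo-id-or-model-path", model_path,
            "--low-bit", low_bit]


def classify(returncode, stdout):
    """Label a finished run as 'failed', 'no output', or 'ok'."""
    if returncode != 0:
        return "failed"
    return "ok" if stdout.strip() else "no output"


def run_all(script, model_path, formats=LOW_BIT_FORMATS, timeout=600):
    """Run the script once per format and return {format: label}."""
    results = {}
    for low_bit in formats:
        cmd = build_command(script, model_path, low_bit)
        try:
            proc = subprocess.run(cmd, capture_output=True,
                                  text=True, timeout=timeout)
            results[low_bit] = classify(proc.returncode, proc.stdout)
        except subprocess.TimeoutExpired:
            # The run did not finish within the (assumed) time limit.
            results[low_bit] = "timeout"
    return results
```

For example, `run_all("./transformers_low_bit_pipeline.py", r"D:\StreamingMedia\model\LLM\Qwen2-1.5B-Instruct")` would return one label per format. A "failed" label (nonzero exit code) may hint that the process ended abnormally rather than finishing with empty output.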
Screenshots
sym_int4 output: [screenshot not preserved]
sym_int8 output: [screenshot not preserved]
When I debug, execution breaks in \ipex_llm\transformers\low_bit_linear.py at this line:
result = xe_linear.forward_new(x_2d, w, self.qtype, self.out_len)
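If the process dies inside the native call on the line above, Python's standard `faulthandler` module may help confirm it by dumping the Python traceback on a fatal error. This is a sketch under that assumption: `enable_crash_traces` and the default log file name are hypothetical, and whether a traceback is actually written depends on how the native code fails.

```python
import faulthandler


def enable_crash_traces(log_path="crash_trace.log"):
    """Dump Python tracebacks of all threads to log_path on a fatal error.

    Returns the open log file; it must stay open while faulthandler is
    enabled, so the caller should keep a reference to it.
    """
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

Calling `enable_crash_traces()` at the top of transformers_low_bit_pipeline.py, before the model is loaded, would leave any captured traceback in the log file after a crash.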
The transformers_low_bit_pipeline.py script is taken from the linked example, transformers_low_bit_pipeline.py.