[Bug]: Poor support for Hebrew #1114

@shohamyamin

Model Series

Qwen2.5

What are the models used?

Qwen2.5-Turbo-1M-Demo

What is the scenario where the problem happened?

When using Hebrew in the input.

Is this a known issue?

  • I have followed the GitHub README.
  • I have checked the Qwen documentation and cannot find an answer there.
  • I have checked the documentation of the related framework and cannot find useful information.
  • I have searched the issues and there is not a similar one.

Information about environment

I used Qwen/Qwen2.5-Turbo-1M-Demo on Hugging Face.

Log output

input: 
"היי אתה יכול לרשום לי פסקה בעברית על כנסת ישראל?"

output:
"แน่นอน! 以下是关于以色列议会的段落,但我将用希伯来语书写:

כנסת ישראל היא הגוף המחוקק העיקרי במדינת ישראל. היא כוללת 120 נציגים שנבחרים בבחירות כלליות אחת לשנתיים. הכנסתי的第一个职能是制定法律,监管国家的行政机构,并批准预算。ckeet以色列议会的成员被称为国会议员(MK),他们代表不同的政党或独立候选人。议会大厦位于耶路撒冷。议会的历史可以追溯到1949年,当时第一届议会选举举行。自那时以来,议会经历了许多变化和发展,反映了以色列社会的多样性和复杂性。"

Description

Steps to reproduce

This happens with Qwen/Qwen2.5-Turbo-1M-Demo and every other Qwen model.
The problem can be reproduced with the following steps:

  1. Ask a Qwen model a question in Hebrew, such as: "היי אתה יכול לרשום לי פסקה בעברית על כנסת ישראל?" (meaning "Hi, can you write me a paragraph in Hebrew about the Israeli Knesset?").

Expected results

The results are expected to be:
The entire answer should be in Hebrew, without any Chinese characters or irrelevant English words.
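
One rough way to check this expectation automatically is to count which scripts the letters of a reply belong to. The sketch below is only a heuristic built on the Python standard library; the helper names `script_of`, `script_counts`, and `is_hebrew_only` are made up for illustration and are not part of any Qwen tooling. It classifies each letter by the prefix of its Unicode character name and ignores digits, punctuation, and whitespace.

```python
import unicodedata
from collections import Counter


def script_of(ch: str) -> str:
    """Return a coarse script label for one letter, based on its Unicode name."""
    name = unicodedata.name(ch, "")
    for prefix in ("HEBREW", "CJK", "THAI", "LATIN"):
        if name.startswith(prefix):
            return prefix
    return "OTHER_LETTER"


def script_counts(text: str) -> Counter:
    """Count letters per script; digits, punctuation, and spaces are skipped."""
    return Counter(script_of(ch) for ch in text if ch.isalpha())


def is_hebrew_only(text: str) -> bool:
    """True if the text has at least one letter and every letter is Hebrew."""
    counts = script_counts(text)
    return bool(counts) and set(counts) == {"HEBREW"}


if __name__ == "__main__":
    # A short sample mixing the scripts seen in the log output above.
    reply = "แน่นอน! 以下是 כנסת ישראל ckeet"
    print(script_counts(reply))
    print("Hebrew only:", is_hebrew_only(reply))
```

Note that a strict Hebrew-only check would also flag legitimate Latin abbreviations, so a real check might allow a small share of Latin letters instead of none.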

Attempts to fix

I have tried several ways to fix this, including:

  1. Trying different system prompts
  2. Trying different questions

Anything else helpful for investigation

I find that this problem also happens with all Qwen models. It seems that support for Hebrew is not good at all.
