💡 [REQUEST] - Release of Jointly-Trained Text Encoder (like CLIP) #510

@michaelsheka

Start Date

No response

Implementation PR

No response

Reference Issues

No response

Summary

Hi Qwen-VL Team,

Thank you for your amazing work on Qwen-VL! It’s a powerful and much-appreciated contribution to the community.

I’d like to kindly request the release of the text encoder trained jointly with the image encoder. This would enable broader use in tasks like multi-modal retrieval, alignment-based applications, and research on cross-modal embeddings.

It would be a valuable addition to an already excellent project. Thank you for considering this!

Best regards,
Michael

Basic Example

Usage would be like other image-text retrieval models (e.g., CLIP).
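
To illustrate the kind of usage I have in mind, here is a minimal sketch of text-to-image retrieval by cosine similarity. It assumes a hypothetical released text encoder that maps text into the same embedding space as the image encoder. The vectors below are hand-written placeholders, not real Qwen-VL outputs, and `rank_images` and `l2_normalize` are illustrative helper names, not part of any existing API.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def rank_images(text_embedding, image_embeddings):
    """Return image indices sorted by cosine similarity to the text, best first."""
    query = l2_normalize(text_embedding)
    scores = []
    for idx, emb in enumerate(image_embeddings):
        emb = l2_normalize(emb)
        similarity = sum(q * e for q, e in zip(query, emb))
        scores.append((similarity, idx))
    scores.sort(reverse=True)
    return [idx for _, idx in scores]


# Placeholder vectors standing in for encoder outputs (assumption: both
# encoders would project into one shared embedding space).
image_embeddings = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
]
text_embedding = [0.9, 0.1, 0.0]

ranking = rank_images(text_embedding, image_embeddings)
print(ranking)  # [0, 2, 1]
```

With a jointly trained text encoder, the placeholder vectors would simply be replaced by real text and image embeddings.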

Drawbacks

None that I can see. It should take only a tiny effort to release it.

Unresolved questions

No response

Metadata

Assignees

No one assigned

Labels

question: Further information is requested
