起始日期 | Start Date
No response
实现PR | Implementation PR
No response
相关Issues | Reference Issues
No response
摘要 | Summary
Hi Qwen-VL Team,
Thank you for your amazing work on Qwen-VL! It’s a powerful and much-appreciated contribution to the community.
I’d like to kindly request the release of the text encoder trained jointly with the image encoder. This would enable broader use in tasks like multi-modal retrieval, alignment-based applications, and research on cross-modal embeddings.
It would be a valuable addition to an already excellent project. Thank you for considering this!
Best regards,
Michael
基本示例 | Basic Example
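Usage would be similar to other image-text retrieval models: embed a text query and a set of images, then rank the images by similarity to the query.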
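To illustrate the kind of image-text retrieval I have in mind, here is a minimal sketch. It does not call any Qwen-VL API: the vectors are made-up stand-ins for what a released text encoder and the image encoder might produce, and the names `cosine_similarity` and `rank_images` are hypothetical helpers written only for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_images(text_embedding, image_embeddings):
    """Return (image name, score) pairs sorted from best to worst match."""
    scores = {
        name: cosine_similarity(text_embedding, emb)
        for name, emb in image_embeddings.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Assumption: toy 3-d vectors standing in for real image-encoder outputs.
image_embeddings = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.1, 0.9, 0.0],
    "car.jpg": [0.0, 0.1, 0.9],
}

# Assumption: toy vector standing in for the text encoder's output
# for a query such as "a photo of a cat".
text_embedding = [0.8, 0.2, 0.1]

ranking = rank_images(text_embedding, image_embeddings)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

With a jointly trained text encoder, the toy vectors would simply be replaced by real embeddings from both encoders; the ranking step stays the same.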
缺陷 | Drawbacks
None that I can see. Releasing the text encoder should take very little effort.
未解决问题 | Unresolved questions
No response