Code for the paper “Towards Mixed-Modal Retrieval for Universal Retrieval-Augmented Generation”

SnowNation101/Nyx

Nyx Logo

🌓 Nyx: Unified Multimodal Retriever for MRAG

This repository contains the official implementation of our paper "Towards Mixed-Modal Retrieval for Universal Retrieval-Augmented Generation".

Introduction

We propose Nyx, a unified mixed-modal retriever tailored for Universal Retrieval-Augmented Generation (URAG) scenarios, and construct NyxQA, a large-scale mixed-modal QA dataset. Our framework includes:

  • A four-stage automated pipeline for generating realistic multimodal QA pairs.
  • A two-stage training framework combining pre-training on NyxQA and supervised fine-tuning with VLM feedback.
  • Strong performance on both text-only RAG benchmarks and vision-language URAG tasks.

Preparation

We recommend using Conda for package management.

conda create -n nyx python=3.11
conda activate nyx
pip install -r requirements.txt

Our implementation uses torch==2.4.0, faiss-cpu==1.8.0, and transformers==4.52.2. Note that faiss-cpu and transformers may require conflicting numpy versions. We recommend keeping numpy at 1.26.4, the version compatible with faiss-cpu, so you may need to uninstall any newer numpy version that gets pulled in.

Suggested installation order: PyTorch → faiss-cpu → transformers → accelerate → deepspeed
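
After installation, it can help to confirm that the environment still matches the versions listed above, since a later package may silently upgrade numpy. The following is a minimal sketch, not part of the repository: the helper name `find_mismatches` and the `EXPECTED` table are illustrative assumptions, and only the four versions stated in this README are checked (accelerate and deepspeed are skipped because no versions are given for them).

```python
from importlib.metadata import PackageNotFoundError, version

# Versions stated in this README. accelerate and deepspeed are omitted
# because the README does not pin them.
EXPECTED = {
    "torch": "2.4.0",
    "faiss-cpu": "1.8.0",
    "transformers": "4.52.2",
    "numpy": "1.26.4",
}


def find_mismatches(expected, lookup=version):
    """Return {package: (expected, installed_or_None)} for every mismatch.

    `lookup` is a hypothetical hook (defaulting to importlib.metadata.version)
    so the check can be exercised without the packages being installed.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = lookup(name)
        except PackageNotFoundError:
            installed = None
        # Ignore local build tags such as "+cu121" when comparing.
        if installed is None or installed.split("+")[0] != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    problems = find_mismatches(EXPECTED)
    for name, (wanted, installed) in problems.items():
        print(f"{name}: expected {wanted}, found {installed or 'not installed'}")
    if not problems:
        print("All pinned packages match.")
```

If numpy is reported as a mismatch, reinstalling it at 1.26.4 after the other packages is the likely fix, in line with the note above.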

Acknowledgements

The core implementation of this project is built upon VLM2Vec. We extend our sincere gratitude to the original authors for their foundational work.

We also want to acknowledge and thank the developers of these essential tools that made our work possible:

  • vLLM for efficient LLM inference
  • FlashAttention for optimized attention computation
  • DeepSpeed for distributed training acceleration

Our work stands on the shoulders of these remarkable open-source projects and the generous research community.

We also want to note that the logo at the top of this README is adapted from the character Nyx in the game Hades by Supergiant Games.
