default search action
Fan Yu 0002
Person information
- affiliation: Alibaba Group, Speech Lab of DAMO Academy, China
Other persons with the same name
- Fan Yu — disambiguation page
- Fan Yu 0001
— Central South University, School of Traffic and Transport Engineering, Changsha, China (and 1 more)
- Fan Yu 0003
— Nanjing University, State Key Laboratory for Novel Software Technology, China
Refine list
refinements active!
zoomed in on ?? of ?? records
view refined list in
export refined list as
Conference and Workshop Papers
- 2025
- [c23]Ziyang Ma, Guanrou Yang, Yifan Yang, Zhifu Gao, Jiaming Wang, Zhihao Du, Fan Yu, Qian Chen, Siqi Zheng, Shiliang Zhang, Xie Chen:
Speech Recognition Meets Large Language Model: Benchmarking, Models, and Exploration. AAAI 2025: 24840-24848 - [c22]Guanrou Yang, Fan Yu, Ziyang Ma, Zhihao Du, Zhifu Gao, Shiliang Zhang, Xie Chen:
Enhancing Low-Resource ASR through Versatile TTS: Bridging the Data Gap. ICASSP 2025: 1-5 - 2024
- [c21]Fan Yu, Haoxu Wang, Ziyang Ma, Shiliang Zhang:
Hourglass-AVSR: Down-Up Sampling-Based Computational Efficiency Model for Audio-Visual Speech Recognition. ICASSP 2024: 7940-7944 - [c20]Fan Yu, Haoxu Wang, Xian Shi, Shiliang Zhang:
LCB-Net: Long-Context Biasing for Audio-Visual Speech Recognition. ICASSP 2024: 10621-10625 - [c19]Haoxu Wang, Fan Yu, Xian Shi, Yuezhang Wang, Shiliang Zhang, Ming Li:
SlideSpeech: A Large Scale Slide-Enriched Audio-Visual Corpus. ICASSP 2024: 11076-11080 - [c18]Guanrou Yang, Ziyang Ma, Fan Yu, Zhifu Gao, Shiliang Zhang, Xie Chen:
MaLa-ASR: Multimedia-Assisted LLM-Based ASR. INTERSPEECH 2024 - 2023
- [c17]Mohan Shi, Jie Zhang, Zhihao Du, Fan Yu, Qian Chen, Shiliang Zhang, Li-Rong Dai:
A Comparative Study on Multichannel Speaker-Attributed Automatic Speech Recognition in Multi-party Meetings. APSIPA ASC 2023: 1943-1948 - [c16]Peikun Chen, Fan Yu, Yuhao Liang, Hongfei Xue, Xucheng Wan, Naijun Zheng
, Huan Zhou, Lei Xie:
BA-MoE: Boundary-Aware Mixture-of-Experts Adapter for Code-Switching Speech Recognition. ASRU 2023: 1-7 - [c15]Yangze Li, Fan Yu, Yuhao Liang, Pengcheng Guo, Mohan Shi, Zhihao Du, Shiliang Zhang, Lei Xie:
Sa-Paraformer: Non-Autoregressive End-To-End Speaker-Attributed ASR. ASRU 2023: 1-7 - [c14]Yuhao Liang, Mohan Shi, Fan Yu, Yangze Li, Shiliang Zhang, Zhihao Du, Qian Chen, Lei Xie, Yanmin Qian, Jian Wu, Zhuo Chen, Kong Aik Lee, Zhijie Yan, Hui Bu:
The Second Multi-Channel Multi-Party Meeting Transcription Challenge (M2MeT 2.0): A Benchmark for Speaker-Attributed ASR. ASRU 2023: 1-8 - [c13]Mohan Shi, Zhihao Du, Qian Chen, Fan Yu, Yangze Li, Shiliang Zhang, Jie Zhang, Li-Rong Dai:
CASA-ASR: Context-Aware Speaker-Attributed ASR. INTERSPEECH 2023: 411-415 - [c12]Yuhao Liang, Fan Yu, Yangze Li, Pengcheng Guo, Shiliang Zhang, Qian Chen, Lei Xie:
BA-SOT: Boundary-Aware Serialized Output Training for Multi-Talker ASR. INTERSPEECH 2023: 3487-3491 - 2022
- [c11]Fan Yu, Shiliang Zhang, Yihui Fu, Lei Xie, Siqi Zheng, Zhihao Du, Weilong Huang, Pengcheng Guo, Zhijie Yan, Bin Ma, Xin Xu, Hui Bu:
M2Met: The Icassp 2022 Multi-Channel Multi-Party Meeting Transcription Challenge. ICASSP 2022: 6167-6171 - [c10]Fan Yu, Shiliang Zhang, Pengcheng Guo, Yihui Fu, Zhihao Du, Siqi Zheng, Weilong Huang, Lei Xie, Zheng-Hua Tan
, DeLiang Wang, Yanmin Qian, Kong Aik Lee, Zhijie Yan, Bin Ma, Xin Xu, Hui Bu:
Summary on the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge. ICASSP 2022: 9156-9160 - [c9]Fan Yu, Zhihao Du, Shiliang Zhang, Yuxiao Lin, Lei Xie:
A Comparative Study on Speaker-attributed Automatic Speech Recognition in Multi-party Meetings. INTERSPEECH 2022: 560-564 - [c8]Yuxiao Lin, Zhihao Du, Shiliang Zhang, Fan Yu, Zhou Zhao, Fei Wu:
Separate-to-Recognize: Joint Multi-target Speech Separation and Speech Recognition for Speaker-attributed ASR. ISCSLP 2022: 150-154 - [c7]Ao Zhang, Fan Yu, Kaixun Huang, Lei Xie, Longbiao Wang, Eng Siong Chng, Hui Bu, Binbin Zhang, Wei Chen, Xin Xu:
The ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge (ICSRC): Dataset, Tracks, Baseline and Results. ISCSLP 2022: 507-511 - [c6]Yuhao Liang, Peikun Chen, Fan Yu, Xinfa Zhu, Tianyi Xu, Yingying Gao, Lei Xie:
The NPU-ASLP System for The ISCSLP 2022 Magichub Code-Swiching ASR Challenge. ISCSLP 2022: 532-536 - [c5]Fan Yu, Shiliang Zhang, Pengcheng Guo, Yuhao Liang, Zhihao Du, Yuxiao Lin, Lei Xie:
MFCCA:Multi-Frame Cross-Channel Attention for Multi-Speaker ASR in Multi-Party Meeting Scenario. SLT 2022: 144-151 - 2021
- [c4]Fan Yu, Haoneng Luo, Pengcheng Guo, Yuhao Liang, Zhuoyuan Yao, Lei Xie, Yingying Gao, Leijing Hou, Shilei Zhang:
Boundary and Context Aware Training for CIF-Based Non-Autoregressive End-to-End ASR. ASRU 2021: 328-334 - [c3]Xian Shi, Fan Yu, Yizhou Lu, Yuhao Liang, Qiangze Feng, Daliang Wang, Yanmin Qian, Lei Xie:
The Accented English Speech Recognition Challenge 2020: Open Datasets, Tracks, Baselines, Results and Methods. ICASSP 2021: 6918-6922 - [c2]Zhuoyuan Yao, Di Wu, Xiong Wang, Binbin Zhang, Fan Yu, Chao Yang, Zhendong Peng, Xiaoyu Chen, Lei Xie, Xin Lei:
WeNet: Production Oriented Streaming and Non-Streaming End-to-End Speech Recognition Toolkit. Interspeech 2021: 4054-4058 - [c1]Fan Yu, Zhuoyuan Yao, Xiong Wang, Keyu An, Lei Xie, Zhijian Ou, Bo Liu, Xiulin Li, Guanqiong Miao:
The SLT 2021 Children Speech Recognition Challenge: Open Datasets, Rules and Baselines. SLT 2021: 1117-1123
Informal and Other Publications
- 2025
- [i26]Qian Chen, Yafeng Chen, Yanni Chen, Mengzhe Chen, Yingda Chen, Chong Deng, Zhihao Du, Ruize Gao, Changfeng Gao, Zhifu Gao, Yabin Li, Xiang Lv, Jiaqing Liu, Haoneng Luo, Bin Ma, Chongjia Ni, Xian Shi, Jialong Tang, Hui Wang, Hao Wang, Wen Wang, Yuxuan Wang, Yunlan Xu, Fan Yu, Zhijie Yan, Yexin Yang, Baosong Yang, Xian Yang, Guanrou Yang, Tianyu Zhao, Qinglin Zhang, Shiliang Zhang, Nan Zhao, Pei Zhang, Chong Zhang, Jinren Zhou:
MinMo: A Multimodal Large Language Model for Seamless Voice Interaction. CoRR abs/2501.06282 (2025) - [i25]Guanrou Yang, Chen Yang, Qian Chen, Ziyang Ma, Wenxi Chen, Wen Wang, Tianrui Wang, Yifan Yang, Zhikang Niu, Wenrui Liu, Fan Yu, Zhihao Du, Zhifu Gao, Shiliang Zhang, Xie Chen:
EmoVoice: LLM-based Emotional Text-To-Speech Model with Freestyle Text Prompting. CoRR abs/2504.12867 (2025) - [i24]Zhihao Du, Changfeng Gao, Yuxuan Wang, Fan Yu, Tianyu Zhao, Hao Wang, Xiang Lv, Hui Wang, Chongjia Ni, Xian Shi, Keyu An, Guanrou Yang, Yabin Li, Yanni Chen, Zhifu Gao, Qian Chen, Yue Gu, Mengzhe Chen, Yafeng Chen, Shiliang Zhang, Wen Wang, Jieping Ye:
CosyVoice 3: Towards In-the-wild Speech Generation via Scaling-up and Post-training. CoRR abs/2505.17589 (2025) - 2024
- [i23]Fan Yu, Haoxu Wang, Xian Shi, Shiliang Zhang:
LCB-net: Long-Context Biasing for Audio-Visual Speech Recognition. CoRR abs/2401.06390 (2024) - [i22]Ziyang Ma, Guanrou Yang, Yifan Yang, Zhifu Gao, Jiaming Wang, Zhihao Du, Fan Yu, Qian Chen, Siqi Zheng, Shiliang Zhang, Xie Chen:
An Embarrassingly Simple Approach for LLM with Strong ASR Capacity. CoRR abs/2402.08846 (2024) - [i21]Guanrou Yang, Ziyang Ma, Fan Yu, Zhifu Gao, Shiliang Zhang, Xie Chen:
MaLa-ASR: Multimedia-Assisted LLM-Based ASR. CoRR abs/2406.05839 (2024) - [i20]Guanrou Yang, Fan Yu, Ziyang Ma, Zhihao Du, Zhifu Gao, Shiliang Zhang, Xie Chen:
Enhancing Low-Resource ASR through Versatile TTS: Bridging the Data Gap. CoRR abs/2410.16726 (2024) - [i19]Zhihao Du, Yuxuan Wang, Qian Chen, Xian Shi, Xiang Lv, Tianyu Zhao, Zhifu Gao, Yexin Yang, Changfeng Gao, Hui Wang, Fan Yu, Huadai Liu, Zhengyan Sheng, Yue Gu, Chong Deng, Wen Wang, Shiliang Zhang, Zhijie Yan, Jingren Zhou:
CosyVoice 2: Scalable Streaming Speech Synthesis with Large Language Models. CoRR abs/2412.10117 (2024) - 2023
- [i18]Mohan Shi, Zhihao Du, Qian Chen, Fan Yu, Yangze Li, Shiliang Zhang, Jie Zhang, Li-Rong Dai:
CASA-ASR: Context-Aware Speaker-Attributed ASR. CoRR abs/2305.12459 (2023) - [i17]Yuhao Liang, Fan Yu, Yangze Li, Pengcheng Guo, Shiliang Zhang, Qian Chen, Lei Xie:
BA-SOT: Boundary-Aware Serialized Output Training for Multi-Talker ASR. CoRR abs/2305.13716 (2023) - [i16]Haoxu Wang, Fan Yu, Xian Shi, Yuezhang Wang, Shiliang Zhang, Ming Li:
SlideSpeech: A Large-Scale Slide-Enriched Audio-Visual Corpus. CoRR abs/2309.05396 (2023) - [i15]Yuhao Liang, Mohan Shi, Fan Yu, Yangze Li, Shiliang Zhang, Zhihao Du, Qian Chen, Lei Xie, Yanmin Qian, Jian Wu, Zhuo Chen, Kong Aik Lee, Zhijie Yan, Hui Bu:
The second multi-channel multi-party meeting transcription challenge (M2MeT) 2.0): A benchmark for speaker-attributed ASR. CoRR abs/2309.13573 (2023) - [i14]Peikun Chen, Fan Yu, Yuhao Liang, Hongfei Xue, Xucheng Wan, Naijun Zheng
, Huan Zhou, Lei Xie:
BA-MoE: Boundary-Aware Mixture-of-Experts Adapter for Code-Switching Speech Recognition. CoRR abs/2310.02629 (2023) - [i13]Yangze Li, Fan Yu, Yuhao Liang, Pengcheng Guo, Mohan Shi, Zhihao Du, Shiliang Zhang, Lei Xie:
SA-Paraformer: Non-autoregressive End-to-End Speaker-Attributed ASR. CoRR abs/2310.04863 (2023) - [i12]Fan Yu, Haoxu Wang, Ziyang Ma, Shiliang Zhang:
Hourglass-AVSR: Down-Up Sampling-based Computational Efficiency Model for Audio-Visual Speech Recognition. CoRR abs/2312.08850 (2023) - 2022
- [i11]Fan Yu, Shiliang Zhang, Pengcheng Guo, Yihui Fu, Zhihao Du, Siqi Zheng, Weilong Huang, Lei Xie, Zheng-Hua Tan, DeLiang Wang, Yanmin Qian, Kong Aik Lee, Zhijie Yan, Bin Ma, Xin Xu, Hui Bu:
Summary On The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge. CoRR abs/2202.03647 (2022) - [i10]Fan Yu, Zhihao Du, Shiliang Zhang, Yuxiao Lin, Lei Xie:
A Comparative Study on Speaker-attributed Automatic Speech Recognition in Multi-party Meetings. CoRR abs/2203.16834 (2022) - [i9]Fan Yu, Shiliang Zhang, Pengcheng Guo, Yuhao Liang, Zhihao Du, Yuxiao Lin, Lei Xie:
MFCCA: Multi-Frame Cross-Channel attention for multi-speaker ASR in Multi-party meeting scenario. CoRR abs/2210.05265 (2022) - [i8]Yuhao Liang, Peikun Chen, Fan Yu, Xinfa Zhu, Tianyi Xu, Lei Xie:
The NPU-ASLP System for The ISCSLP 2022 Magichub Code-Swiching ASR Challenge. CoRR abs/2210.14448 (2022) - [i7]Ao Zhang, Fan Yu, Kaixun Huang, Lei Xie, Longbiao Wang, Eng Siong Chng, Hui Bu, Binbin Zhang, Wei Chen, Xin Xu:
The ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge (ICSRC): Dataset, Tracks, Baseline and Results. CoRR abs/2211.01585 (2022) - 2021
- [i6]Binbin Zhang, Di Wu, Chao Yang, Xiaoyu Chen, Zhendong Peng, Xiangming Wang, Zhuoyuan Yao, Xiong Wang, Fan Yu, Lei Xie, Xin Lei:
WeNet: Production First and Production Ready End-to-End Speech Recognition Toolkit. CoRR abs/2102.01547 (2021) - [i5]Xian Shi, Fan Yu, Yizhou Lu, Yuhao Liang, Qiangze Feng, Daliang Wang, Yanmin Qian, Lei Xie:
The Accented English Speech Recognition Challenge 2020: Open Datasets, Tracks, Baselines, Results and Methods. CoRR abs/2102.10233 (2021) - [i4]Fan Yu, Haoneng Luo, Pengcheng Guo, Yuhao Liang, Zhuoyuan Yao, Lei Xie, Yingying Gao, Leijing Hou, Shilei Zhang:
Boundary and Context Aware Training for CIF-based Non-Autoregressive End-to-end ASR. CoRR abs/2104.04702 (2021) - [i3]Fan Yu, Shiliang Zhang, Yihui Fu, Lei Xie, Siqi Zheng, Zhihao Du, Weilong Huang, Pengcheng Guo, Zhijie Yan, Bin Ma, Xin Xu, Hui Bu:
M2MeT: The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge. CoRR abs/2110.07393 (2021) - 2020
- [i2]Fan Yu, Zhuoyuan Yao, Xiong Wang, Keyu An, Lei Xie, Zhijian Ou, Bo Liu, Xiulin Li, Guanqiong Miao:
The SLT 2021 children speech recognition challenge: Open datasets, rules and baselines. CoRR abs/2011.06724 (2020) - [i1]Binbin Zhang, Di Wu, Zhuoyuan Yao, Xiong Wang, Fan Yu, Chao Yang, Liyong Guo, Yaguang Hu, Lei Xie, Xin Lei:
Unified Streaming and Non-streaming Two-pass End-to-end Model for Speech Recognition. CoRR abs/2012.05481 (2020)
Coauthor Index
manage site settings
To protect your privacy, all features that rely on external API calls from your browser are turned off by default. You need to opt-in for them to become active. All settings here will be stored as cookies with your web browser. For more information see our F.A.Q.
Unpaywalled article links
Add open access links from to the list of external document links (if available).
Privacy notice: By enabling the option above, your browser will contact the API of unpaywall.org to load hyperlinks to open access articles. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Unpaywall privacy policy.
Archived links via Wayback Machine
For web page which are no longer available, try to retrieve content from the of the Internet Archive (if available).
Privacy notice: By enabling the option above, your browser will contact the API of archive.org to check for archived content of web pages that are no longer available. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Internet Archive privacy policy.
Reference lists
Add a list of references from ,
, and
to record detail pages.
load references from crossref.org and opencitations.net
Privacy notice: By enabling the option above, your browser will contact the APIs of crossref.org, opencitations.net, and semanticscholar.org to load article reference information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Crossref privacy policy and the OpenCitations privacy policy, as well as the AI2 Privacy Policy covering Semantic Scholar.
Citation data
Add a list of citing articles from and
to record detail pages.
load citations from opencitations.net
Privacy notice: By enabling the option above, your browser will contact the API of opencitations.net and semanticscholar.org to load citation information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the OpenCitations privacy policy as well as the AI2 Privacy Policy covering Semantic Scholar.
OpenAlex data
Load additional information about publications from .
Privacy notice: By enabling the option above, your browser will contact the API of openalex.org to load additional information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the information given by OpenAlex.
last updated on 2025-11-10 00:27 CET by the dblp team
all metadata released as open data under CC0 1.0 license
see also: Terms of Use | Privacy Policy | Imprint