Faculty Profiles - XUE Wei | The Hong Kong University of Science and Technology

Assistant Professor
Division of Arts and Machine Creativity
Division of Emerging Interdisciplinary Areas

Research Interest

Artificial intelligence

AI content generation

AI music

Computational audition

Speech processing

Publications

2026 9

A Systematic Review of Context-Aware and AI-Driven Home Energy and Comfort Management

EBES Sustainable Building, v. 1, (1), article number 100007
GAO, Jiajing; SUN, Cheng; ZHUANG, Dian; XUE, Wei; XIANG, Changying
Article

CoCoGesture: Towards coherent co-speech 3D gesture generation in the wild

Information Fusion, v. 126, article number 103613
Qi, Xingqun; Zhang, Hengyuan; Wang, Yatian; Pan, Jiahao; Liu, Chen; Sun, Muyi; Xue, Wei; Zhang, Shanghang; Han, Sirui; Liu, Qifeng; Guo, Yike
Article

HiPrompt: Tuning-free Higher-Resolution Generation with Hierarchical MLLM Prompts

International Journal of Computer Vision, v. 134, (4), article number 147
Liu, Xinyu; He, Yingqing; Guo, Lanqing; Li, Xiang; Jin, Bu; Li, Yan; Chan, Chi Min; Xue, Wei; Luo, Wenhan; Liu, Qifeng; Guo, Yike
Article

AudioX: A Unified Framework for Anything-to-Audio Generation

Paper presented at The 14th International Conference on Learning Representations (ICLR 2026), Rio de Janeiro, Brazil
TIAN, Zeyue; LIU, Zhaoyang; JIN, Yizhu; YUAN, Ruibin; Xue, Liumeng; Tan, Xu; CHEN, Qifeng; XUE, Wei; GUO, Yike
Conference paper

Inference-time Scaling for Diffusion-based Audio Super-resolution

Proceedings of the AAAI Conference on Artificial Intelligence / edited by Koenig Sven; Jenkins Chad; Taylor Matthew E.. Association for the Advancement of Artificial Intelligence, 2026, p. 14982-14990
Jin, Yizhu; Ye, Zhen; Tian, Zeyue; Liu, Haohe; Kong, Qiuqiang; Guo, Yike; Xue, Wei
Conference paper

Lighthouse: A Self-Reconfiguring Sociotechnical Infrastructure for the Unforeseen Long-Tail of Urban Crisis

CHI EA '26: Proceedings of the Extended Abstracts of the 2026 CHI Conference on Human Factors in Computing Systems / edited by Oliver Nuria; Shamma David A.; Candello Heloisa; Cesar Pablo; Lopes Pedro; Artizzu Valentino; Draxler Fiona; Lopez Gustavo; Reinschluessel Anke V.; Tong Xin; Toups Dugas Phoebe O.. Association for Computing Machinery, 2026, p. 1–6, article number 434
DENG, Ning; ZHOU, Qinyi; ZHUO, Yuchao; Xue, Wei; GUO, Yike
Conference paper

Semantic Voting: A Self-Evaluation-Free Approach for Efficient LLM Self-Improvement on Unverifiable Open-ended Tasks

14th International Conference on Learning Representations, ICLR 2026, 2026
Jiang, Chunyang; Zhang, Yonggang; Cai, Yiyang; Chan, Chi-Min; Liu, Yulong; Chen, Mingming; Xue, Wei; Guo, Yike
Conference paper

VMChill: A Dataset for Fine-Grained Visual-Musical Synergy

Proceedings of the AAAI Conference on Artificial Intelligence, v. 40, (5), p. 3353-3362
Chi, Xiaowei; Tian, Zeyue; Chen, Jialiang; Xue, Wei
Conference paper

WenetSpeech-Yue: A Large-scale Cantonese Speech Corpus with Multi-dimensional Annotation

Proceedings of the AAAI Conference on Artificial Intelligence, v. 40, (37), p. 31627-31635
Li, Longhao; Guo, Zhao; Chen, Hongjie; Dai, Yuhang; Zhang, Ziyu; Xue, Hongfei; Zuo, Tianlun; Wang, Chengyou; Wang, Shuiyuan; Xu, Xin; Bu, Hui; Li, Jie; Kang, Jian; Zhang, Binbin; Yuan, Ruibin; Zhou, Ziya; Xue, Wei; Xie, Lei
Conference paper

2025 28

Enhancing target speaker extraction with Hierarchical Speaker Representation Learning

Neural Networks, v. 188, article number 107388
He, Shulin; Xue, Wei; Yang, Yang; Zhang, Huaiwen; PAN, Jiahao; Zhang, Xueliang
Article

Every Angle is Worth a Second Glance: Mining Kinematic Skeletal Structures From Multi-View Joint Cloud

IEEE Transactions on Visualization and Computer Graphics, v. 31, (10), p. 7337-7349, article number 10902182
Jiang, Junkun; Chen, Jie; Au, Ho Yin; Chen, Mingyuan; Xue, Wei; Guo, Yike
Article

AttnZero: Efficient Attention Discovery for Vision Transformers

Computer Vision – ECCV 2024 - 18th European Conference, Proceedings / edited by Leonardis Aleš; Ricci Elisa; Roth Stefan; Russakovsky Olga; Sattler Torsten; Varol Gül. Springer Science and Business Media Deutschland GmbH, 2025, p. 20-37
Li, Lujun; Wei, Zimian; Dong, Peijie; Luo, Wenhan; Xue, Wei; Liu, Qifeng; Guo, Yike
Conference paper

Auto-GAS: Automated Proxy Discovery for Training-Free Generative Architecture Search

Computer Vision – ECCV 2024 - 18th European Conference, Proceedings / edited by Leonardis Aleš; Ricci Elisa; Roth Stefan; Russakovsky Olga; Sattler Torsten; Varol Gül. Springer Science and Business Media Deutschland GmbH, 2025, p. 38-55
Li, Lujun; Sun, Haosen; Li, Shiwen; Dong, Peijie; Luo, Wenhan; Xue, Wei; Liu, Qifeng; Guo, Yike
Conference paper

BayesKD: Bayesian Knowledge Distillation for Compact LLMs in Constrained Fine-tuning Scenarios

Findings of the Association for Computational Linguistics / edited by Che Wanxiang; Nabende Joyce; Shutova Ekaterina; Pilehvar Mohammad Taher. Association for Computational Linguistics (ACL), 2025, p. 138-152
Li, Wei; Li, Lujun; Lee, Mark; Sun, Shengjie; Zhang, Lei; Xue, Wei; Guo, Yike
Conference paper

Boosting Policy and Process Reward Models with Monte Carlo Tree Search in Open-Domain QA

Findings of the Association for Computational Linguistics: ACL 2025 / edited by Che Wanxiang; Nabende Joyce; Shutova Ekaterina; Pilehvar Mohammad Taher. Association for Computational Linguistics (ACL), 2025, p. 7433-7451
CHAN, Chi-min; XU, Chunpu; ZHU, Junqi; JI, Jiaming; HONG, Donghai; WEN, Pengcheng; JIANG, Chunyang; YE, Zhen; YANG, Yaodong; XUE, Wei; HAN, Sirui; GUO, Yike
Conference paper

Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation

13th International Conference on Learning Representations, ICLR 2025, International Conference on Learning Representations, ICLR, 2025, p. 72147-72190
Sun, Peiwen; Cheng, Sitong; Li, Xiangtai; Ye, Zhen; Liu, Huadai; Zhang, Honggang; Xue, Wei; Guo, Yike
Conference paper

CMD: Controllable Multiview Diffusion for 3D Editing and Progressive Generation

Proceedings - SIGGRAPH 2025 Conference Papers / edited by Spencer Stephen N.. Association for Computing Machinery, Inc, 2025, article number 81
Li, Peng; Ma, Suizhi; Chen, Jialiang; Liu, Yuan; Zhang, Congyi; Xue, Wei; Luo, Wenhan; Sheffer, Alla; Wang, Wenping; Guo, Yike
Conference paper

Co³Gesture: Towards Coherent Concurrent Co-speech 3D Gesture Generation with Interactive Diffusion

13th International Conference on Learning Representations, ICLR 2025, International Conference on Learning Representations, ICLR, 2025, p. 88359-88377
Qi, Xingqun; Wang, Yatian; Zhang, Hengyuan; Pan, Jiahao; Xue, Wei; Zhang, Shanghang; Luo, Wenhan; Liu, Qifeng; Guo, Yike
Conference paper

Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model

Proceedings of the AAAI Conference on Artificial Intelligence, v. 39, (24), p. 25697-25705
Ye, Zhen; Sun, Peiwen; Lei, Jiahe; Lin, Hongzhan; Tan, Xu; Dai, Zheqi; Kong, Qiuqiang; Chen, Jianyi; Pan, Jiahao; Liu, Qifeng; Guo, Yike; Xue, Wei
Conference paper

Delta Decompression for MoE-based LLMs Compression

Proceedings of Machine Learning Research, v. 267, p. 20497-20514
Gu, Hao; Li, Wei; Li, Lujun; Zhu, Qiyuan; Lee, Mark; Sun, Shengjie; Xue, Wei; Guo, Yike
Conference paper

Editing Music with Melody and Text: Using ControlNet for Diffusion Transformer

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Hou, Siyuan; Liu, Shansong; Yuan, Ruibin; Xue, Wei; Shan, Ying; Zhao, Mangsuo; Zhang, Chao
Conference paper

Efficient Fine-Tuning of Large Models via Nested Low-Rank Adaptation

Proceedings of IEEE International Conference on Computer Vision (ICCV), IEEE, 2025, p. 22252-22262
LI, Lujun; LIN, Cheng; LI, Dezhi; Huang, You Liang; LI, Wei; Wu, Tianyu; ZHOU, Jie; XUE, Wei; HAN, Sirui; GUO, Yike
Conference paper

Empowering World Models with Reflection for Embodied Video Prediction

Proceedings of Machine Learning Research, v. 267, p. 10383-10410
Chi, Xiaowei; Fan, Chun Kai; Zhang, Hengyuan; Qi, Xingqun; Zhang, Rongyu; Chen, Anthony; Chan, Chi Min; Xue, Wei; Liu, Qifeng; Zhang, Shanghang; Guo, Yike
Conference paper

EVA: An Embodied World Model for Future Video Anticipation

Paper presented at 42nd International Conference on Machine Learning, Vancouver, Canada
CHI, Xiaowei; FAN, Chun-kai; ZHANG, Hengyuan; QI, Xingqun; ZHANG, Rongyu; CHEN, Anthony; CHAN, Chi-min; XUE, Wei; LIU, Qifeng; ZHANG, Shanghang; GUO, Yike
Conference paper

FlashAudio: Rectified Flows for Fast and High-fidelity Text-to-Audio Generation

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics / edited by Che Wanxiang; Nabende Joyce; Shutova Ekaterina; Pilehvar Mohammad Taher. Association for Computational Linguistics (ACL), 2025, p. 13694-13710
Liu, Huadai; Wang, Jialei; Huang, Rongjie; Liu, Yang; Lu, Heng; Zhao, Zhou; Xue, Wei
Conference paper

Foundation Cures Personalization: Improving Personalized Models' Prompt Consistency via Hidden Foundation Knowledge

Proceedings of the 39th Conference on Neural Information Processing Systems (NeurIPS 2025), 2025
CAI, Yiyang; Jiang, Zhengkai; LIU, Yulong; JIANG, Chunyang; XUE, Wei; GUO, Yike; LUO, Wenhan
Conference paper

Graceful Forgetting in Generative Language Models

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing / edited by Christodoulopoulos Christos; Chakraborty Tanmoy; Rose Carolyn; Peng Violet. Association for Computational Linguistics (ACL), 2025, p. 13165-13180
Jiang, Chunyang; Chan, Chi-min; Cai, Yiyang; Liu, Yulong; Xue, Wei; Guo, Yike
Conference paper

Importance Weighting Can Help Large Language Models Self-Improve

Proceedings of the AAAI Conference on Artificial Intelligence, v. 39, (23), p. 24257-24265
Jiang, Chunyang; Chan, Chi Min; Xue, Wei; Liu, Qifeng; Guo, Yike
Conference paper

LLaSE-G1: Incentivizing Generalization Capability for LLaMA-based Speech Enhancement

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics / edited by Che Wanxiang; Nabende Joyce; Shutova Ekaterina; Pilehvar Mohammad Taher. Association for Computational Linguistics (ACL), 2025, p. 13292-13305
Kang, Boyi; Zhu, Xinfa; Zhang, Zihan; Ye, Zhen; Liu, Mingshuai; Wang, Ziqian; Zhu, Yike; Ma, Guobin; Chen, Jun; Xiao, Longshuai; Weng, Chao; Xue, Wei; Xie, Lei
Conference paper

MelodyEdit: Zero-shot Music Editing with Disentangled Inversion Control

MM 2025 - Proceedings of the 33rd ACM International Conference on Multimedia, Co-Located with MM 2025, Association for Computing Machinery, Inc, 2025, p. 10083-10092
Liu, Huadai; Wang, Jialei; Li, Xiangtai; Wang, Wen; Chen, Qian; Huang, Rongjie; Liu, Yang; Xu, Jiayang; Zhao, Zhou; Xue, Wei
Conference paper

MoE-SVD: Structured Mixture-of-Experts LLMs Compression via Singular Value Decomposition

Proceedings of Machine Learning Research, v. 267, p. 35209-35230
Li, Wei; Li, Lujun; Gu, Hao; Huang, You Liang; Lee, Mark; Sun, Shengjie; Xue, Wei; Guo, Yike
Conference paper

MUPT: A Generative Symbolic Music Pre-Trained Transformer

13th International Conference on Learning Representations, ICLR 2025, International Conference on Learning Representations, ICLR, 2025, p. 44591-44617
Qu, Xingwei; Bai, Yuelin; Ma, Yinghao; Zhou, Ziya; Lo, Ka Man; Liu, Jiaheng; Yuan, Ruibin; Min, Lejun; Liu, Xueling; Zhang, Tianyu; Du, Xinrun; Guo, Shuyue; Liang, Yiming; Li, Yizhi; Wu, Shangda; Zhou, Junting; Zheng, Tianyu; Ma, Ziyang; Han, Fengze; Xue, Wei; Xia, Gus; Benetos, Emmanouil; Yue, Xiang; Lin, Chenghua; Tan, Xu; Huang, Wenhao; Fu, Jie; Zhang, Ge
Conference paper

OmniAudio: Generating Spatial Audio from 360-Degree Video

Proceedings of Machine Learning Research, v. 267, p. 39060-39084
Liu, Huadai; Luo, Tianyi; Luo, Kaicheng; Jiang, Qikai; Sun, Peiwen; Wang, Jialei; Huang, Rongjie; Chen, Qian; Wang, Wen; Li, Xiangtai; Zhang, Shiliang; Yan, Zhijie; Zhao, Zhou; Xue, Wei
Conference paper

PSHuman: Photorealistic Single-image 3D Human Reconstruction using Cross-Scale Multiview Diffusion and Explicit Remeshing

2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2025, article number 11094961
LI, Peng; Zheng, Wangguandong; LIU, Yuan; Yu, Tao; Li, Yangguang; QI, Xingqun; CHI, Xiaowei; Xia, Siyu; Cao, Yanpei; XUE, Wei; LUO, Wenhan; GUO, Yike
Conference paper

STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs

13th International Conference on Learning Representations, ICLR 2025, International Conference on Learning Representations, ICLR, 2025, p. 12014-12039
Dong, Peijie; Li, Lujun; Zhong, Yuedong; Du, Dayou; Fan, Ruibo; Chen, Yuhan; Tang, Zhenheng; Wang, Qiang; Xue, Wei; Guo, Yike; Chu, Xiaowen
Conference paper

ThinkSound: Chain-of-Thought Reasoning in Multimodal Large Language Models for Audio Generation and Editing

Advances in Neural Information Processing Systems, 2025
Liu, Huadai; Luo, Kaicheng; Wang, Jialei; Wang, Wen; Chen, Qian; Zhao, Zhou; Xue, Wei
Conference paper

Vidmuse: A simple video-to-music generation framework with long-short-term modeling

2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Institute of Electrical and Electronics Engineers Inc., 2025, p. 18782-18793, article number 11094002
TIAN, Zeyue; LIU, Zhaoyang; YUAN, Ruibin; PAN, Jiahao; LIU, Qifeng; TAN, Xu; CHEN, Qifeng; XUE, Wei; GUO, Yike
Conference paper

2024 16

Deep Cross-Modal Retrieval Between Spatial Image and Acoustic Speech

IEEE Transactions on Multimedia, v. 26, p. 4480-4489
Qian, Xinyuan; Xue, Wei; Zhang, Qiquan; Tao, Ruijie; Li, Haizhou
Article

Can LLMs "Reason" in Music? An Evaluation of LLMs' Capability of Music Understanding and Generation

Paper presented at International Society for Music Information Retrieval Conference (ISMIR 2024)
Wu, Yuhang; Zhang, Xinyue; Benetos, Emmanouil; Ma, Yinghao; Wang, LU; Wu, Zhiyue; Guo, Yike; Xue, Wei; Yuan, Ruibin; Zhou, Ziya
Conference paper

ChatEval: Towards Better LLM-Based Evaluators Through Multi-Agent Debate

Paper presented at 12th International Conference on Learning Representations, ICLR 2024, Hybrid, Vienna, Austria
Chan, Chi Min; Chen, Weize; Su, Yusheng; Yu, Jianxuan; Xue, Wei; Zhang, Shanghang; Fu, Jie; Liu, Zhiyuan
Conference paper

ChatMusician: Understanding and Generating Music Intrinsically with LLM

Paper presented at 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024)
Benetos, Emmanouil; Chen, Wenhu; Huang, Wenhao; Jiang, Tao; Kang, Shiyin; Li, Yizhi; Li, Pengfei; Liang, Yiming; Lin, Chenghua; Liu, Cong; Liu, Qin; Liu, Ruibo; Ma, Ziyang; Ma, Yinghao; Shen, Tianhao; Wang, Ziyu; Wang, Zili; Wu, Yuhang; Wu, Jingcheng; Xia, Gus; Xue, Liumeng; Zhang, Ge; Zheng, Tianyu; Lin, Hanfeng; Wu, Shangda; Dannenberg, Roger; Wang, Yi; Liu, Qifeng; Chi, Xiaowei; Fu, Jie; Guo, Yike; Tian, Zeyue; Xue, Wei; Yuan, Ruibin; Zhou, Ziya
Conference paper

ChatMusician: Understanding and Generating Music Intrinsically with LLMs

The 62nd Annual Meeting of the Association for Computational Linguistics / edited by Ku Lun-Wei; Martins Andre; Srikumar Vivek. Association for Computational Linguistics (ACL), 2024, p. 6252-6271
Multimodal Art Projection Research Community; Yuan, Ruibin; Lin, Hanfeng; Wang, Yi; Tian, Zeyue; Wu, Shangda; Shen, Tianhao; Zhang, Ge; Wu, Yuhang; Liu, Cong; Zhou, Ziya; Xue, Liumeng; Ma, Ziyang; Liu, Qin; Zheng, Tianyu; Li, Yizhi; Ma, Yinghao; Liang, Yiming; Chi, Xiaowei; Liu, Ruibo; Wang, Zili; Lin, Chenghua; Liu, Qifeng; Jiang, Tao; Huang, Wenhao; Chen, Wenhu; Fu, Jie; Benetos, Emmanouil; Xia, Gus; Dannenberg, Roger; Xue, Wei; Kang, Shiyin; Guo, Yike
Conference paper

COMOSVC: Consistency Model-based Singing Voice Conversion

2024 14th International Symposium on Chinese Spoken Language Processing, ISCSLP 2024 / edited by Qian Yanmin; Jin Qin; Ou Zhijian; Ling Zhenhua; Wu Zhiyong; Li Ya; Xie Lei; Tao Jianhua. Institute of Electrical and Electronics Engineers Inc., 2024, p. 184-188
Lu, Yiwen; Ye, Zhen; Xue, Wei; Tan, Xu; Liu, Qifeng; Guo, Yike
Conference paper

ComposerX: Multi-Agent Symbolic Music Composition with LLMs

Paper presented at International Society for Music Information Retrieval Conference (ISMIR 2024)
Zhang, Ge; Li, Yizhi; Lin, Chenghua; Xia, Guangyu; Huang, Yipeng; Lin, Hanfeng; Wang, Yi; Benetos, Emmanouil; Ma, Yinghao; Yang, Qikai; Deng, Qixin; Liu, Xubo; Wang, Wenwu; Fu, Jie; Guo, Yike; Pan, Jiahao; Tian, Zeyue; Xue, Wei; Yuan, Ruibin
Conference paper

Discovering Sparsity Allocation for Layer-wise Pruning of Large Language Models

Advances in Neural Information Processing Systems, v. 37
Li, Lujun; Dong, Peijie; Tang, Zhenheng; Liu, Xiang; Wang, Qiang; Luo, Wenhan; Xue, Wei; Liu, Qifeng; Chu, Xiaowen; Guo, Yike
Conference paper

FastSAG: Towards Fast Non-Autoregressive Singing Accompaniment Generation

Paper presented at 33rd International Joint Conference on Artificial Intelligence (IJCAI 2024)
Liu, Qifeng; Tan, Xu; Chen, Jianyi; Guo, Yike; Xue, Wei; Ye, Zhen
Conference paper

FlashSpeech: Efficient Zero-Shot Speech Synthesis

MM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia, Association for Computing Machinery, Inc, 2024, p. 6998-7007
Ye, Zhen; Ju, Zeqian; Liu, Haohe; Tan, Xu; Chen, Jianyi; Lu, Yiwen; Sun, Peiwen; Pan, Jiahao; Bian, Weizhen; He, Shulin; Xue, Wei; Liu, Qifeng; Guo, Yike
Conference paper

FM-OV3D: Foundation Model-Based Cross-Modal Knowledge Blending for Open-Vocabulary 3D Detection

Proceedings of the AAAI Conference on Artificial Intelligence, v. 38, (15), p. 16723-16731
Zhang, Dongmei; Li, Chang; Zhang, Renrui; Xie, Shenghao; Xue, Wei; Xie, Xiaodong; Zhang, Shanghang
Conference paper

Generated Therapeutic Music Based on the ISO Principle

Music Intelligence - 2nd Summit, SOMI 2023, Revised Selected Papers / edited by Li Xiaobing; Guan Xiaohong; Tie Yun; Zhang Xinran; Zhou Qingwen. Springer Science and Business Media Deutschland GmbH, 2024, p. 32-45
Qiu, Zipeng; Yuan, Ruibin; Xue, Wei; Jin, Yucheng
Conference paper

PyramidCodec: Hierarchical Codec for Long-form Music Generation in Audio Domain

EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Findings of EMNLP 2024 / edited by Al-Onaizan Yaser; Bansal Mohit; Chen Yun-Nung. Association for Computational Linguistics (ACL), 2024, p. 4253-4263
Chen, Jianyi; Dai, Zheqi; Ye, Zhen; Tan, Xu; Liu, Qifeng; Guo, Yike; Xue, Wei
Conference paper

RQ-RAG: Learning to Refine Queries for Retrieval Augmented Generation

Paper presented at Conference on Language Modeling (COLM 2024)
Xu, Chunpu; Luo, Hongyin; Chan, Chi-min; Fu, Jie; Guo, Yike; Xue, Wei; Yuan, Ruibin
Conference paper

ViDA: Homeostatic Visual Domain Adapter for Continual Test Time Adaptation

Paper presented at 12th International Conference on Learning Representations, ICLR 2024, Hybrid, Vienna, Austria
Liu, Jiaming; Yang, Senqiao; Jia, Peidong; Zhang, Renrui; Lu, Ming; Guo, Yandong; Xue, Wei; Zhang, Shanghang
Conference paper

Weakly-Supervised Emotion Transition Learning for Diverse 3D Co-speech Gesture Generation

Paper presented at The IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024
Liu, Qifeng; Zhang, Shanghang; Chi, Xiaowei; Guo, Yike; Li, Peng; Li, Mengfei; Luo, Wenhan; Pan, Jiahao; Qi, Xingqun; Xue, Wei; Yuan, Ruibin
Conference paper

2023 5

CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model

MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia, Association for Computing Machinery, Inc, 2023, p. 1831-1839
Ye, Zhen; Xue, Wei; Tan, Xu; Chen, Jie; Liu, Qifeng; Guo, Yike
Conference paper

GCC-Speaker: Target Speaker Localization with Optimal Speaker-Dependent Weighting in Multi-Speaker Scenarios

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, Proceedings, Institute of Electrical and Electronics Engineers Inc., 2023
Li, Guanjun; Xue, Wei; Liu, Wenju; Yi, Jiangyan; Tao, Jianhua
Conference paper

MARBLE: Music Audio Representation Benchmark for Universal Evaluation

Advances in Neural Information Processing Systems 36 - 37th Conference on Neural Information Processing Systems, NeurIPS 2023 / edited by Oh A.; Neumann T.; Globerson A.; Saenko K.; Hardt M.; Levine S.. Neural information processing systems foundation, 2023
Yuan, Ruibin; Ma, Yinghao; Li, Yizhi; Zhang, Ge; Chen, Xingran; Yin, Hanzhi; Zhuo, Le; Liu, Yiqi; Huang, Jiawen; Tian, Zeyue; Deng, Binyue; Wang, Ningzhi; Lin, Chenghua; Benetos, Emmanouil; Ragni, Anton; Gyenge, Norbert; Dannenberg, Roger; Chen, Wenhu; Xia, Gus; Xue, Wei; Liu, Si; Wang, Shi; Liu, Ruibo; Guo, Yike; Fu, Jie
Conference paper

MoMusic: A Motion-Driven Human-AI Collaborative Music Composition and Performing System

Paper presented at 13th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI 2023)
Bian, Weizhen; Chan, Tin Yan; Gu, Nianzhen; Li, Tsun Sun; Lo, Tsz To; Song, Yijin; Trillo, Roberto Alonso; Wong, King Chak; Xue, Wei
Conference paper

NAS-FM: Neural Architecture Search for Tunable and Interpretable Sound Synthesis Based on Frequency Modulation

Proceedings of the 32nd International Joint Conference on Artificial Intelligence, IJCAI 2023 / edited by Elkind Edith. International Joint Conferences on Artificial Intelligence, 2023, p. 5869-5877
Ye, Zhen; Xue, Wei; Tan, Xu; Liu, Qifeng; Guo, Yike
Conference paper

2022 2

Deep Audio-Visual Beamforming for Speaker Localization

IEEE Signal Processing Letters, v. 29, p. 1132-1136
Qian, Xinyuan; Zhang, Qiquan; Guan, Guohui; Xue, Wei
Article

Speech recognition with a hearing-aid processing scheme combining beamforming with mask-informed speech enhancement

Trends in Hearing, v. 26
Green, Tim; Hilkhuysen, Gaston; Huckvale, Mark; Rosen, Stuart; Brookes, Mike; Moore, Alastair; Naylor, Patrick; Lightburn, Leo; Xue, Wei
Article

2021 3

Speech Enhancement Based on Modulation-Domain Parametric Multichannel Kalman Filtering

IEEE/ACM Transactions on Audio Speech and Language Processing, v. 29, p. 393-405, article number 9272832
Xue, Wei; Moore, Alastair H.; Brookes, Mike; Naylor, Patrick A.
Article

Causal System Identification based Compensation for Reverberation-Robust DOA Estimation

29th European Signal Processing Conference, EUSIPCO 2021 - Proceedings, European Signal Processing Conference, EUSIPCO, 2021, p. 1885-1889
He, Li; Xue, Wei
Conference paper

Neural Kalman filtering for speech enhancement

2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021 - Proceedings, Institute of Electrical and Electronics Engineers Inc., 2021, p. 7108-7112
Xue, Wei; Quan, Gang; Zhang, Chao; Ding, Guohong; He, Xiaodong; Zhou, Bowen
Conference paper

2020 4

Frame-GAN: Increasing the frame rate of gait videos with generative adversarial networks

Neurocomputing, v. 380, p. 95-104
Xue, Wei; Ai, Hong; Sun, Tianyu; Song, Chunfeng; Huang, Yan; Wang, Liang
Article

SkipConvNet: Skip convolutional neural network for speech dereverberation using optimally smoothed spectral mapping

Interspeech 2020, International Speech Communication Association, 2020, p. 3935-3939
Kothapally, Vinay; Xia, Wei; Ghorbani, Shahram; Hansen, John H.L.; Xue, Wei; Huang, Jing
Conference paper

Sound event localization and detection based on multiple DOA beamforming and multi-task learning

Interspeech 2020, International Speech Communication Association, 2020, p. 5091-5095
Xue, Wei; Tong, Ying; Zhang, Chao; Ding, Guohong; He, Xiaodong; Zhou, Bowen
Conference paper

The JD AI speaker verification system for the FFSVC 2020 challenge

Interspeech 2020, International Speech Communication Association, 2020, p. 3476-3480
Tong, Ying; Xue, Wei; Huang, Shanluo; Fan, Lu; Zhang, Chao; Ding, Guohong; He, Xiaodong
Conference paper

2019 2

Noise covariance matrix estimation for rotating microphone arrays

IEEE/ACM Transactions on Audio Speech and Language Processing, v. 27, (3), p. 519-530, article number 8540424
Moore, Alastair H.; Xue, Wei; Naylor, Patrick A.; Brookes, Mike
Article

Direct-path signal cross-correlation estimation for sound source localization in reverberation

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v. 2019-September, p. 2693-2697
Xue, Wei; Tong, Ying; Ding, Guohong; Zhang, Chao; Ma, Tao; He, Xiaodong; Zhou, Bowen
Conference paper

2018 5

Modulation-domain multichannel kalman filtering for speech enhancement

IEEE/ACM Transactions on Audio Speech and Language Processing, v. 26, (10), p. 1833-1847
Xue, Wei; Moore, Alastair H.; Brookes, Mike; Naylor, Patrick A.
Article

Binaural mask-informed speech enhancement for hearing aids with head tracking

16th International Workshop on Acoustic Signal Enhancement, IWAENC 2018 - Proceedings, Institute of Electrical and Electronics Engineers Inc., 2018, p. 461-465, article number 8521361
Moore, Alastair H.; Lightburn, Leo; Xue, Wei; Naylor, Patrick A.; Brookes, Mike
Conference paper

Estimation of the Noise Covariance Matrix for Rotating Sensor Arrays

Conference Record of the 52nd Asilomar Conference on Signals, Systems and Computers, ACSSC 2018 / edited by Matthews Michael B.. IEEE Computer Society, 2018, p. 1936-1941, article number 8645397
Moore, Alastair H.; Xue, Wei; Naylor, Patrick A.; Brookes, Mike
Conference paper

Modulation-domain parametric multichannel Kalman filtering for speech enhancement

2018 26th European Signal Processing Conference, EUSIPCO 2018, European Signal Processing Conference, EUSIPCO, 2018, p. 2509-2513, article number 8552954
Xue, Wei; Moore, Alastair H.; Brookes, Mike; Naylor, Patrick A.
Conference paper

Multichannel Kalman Filtering for Speech Enhancement

2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings, Institute of Electrical and Electronics Engineers Inc., 2018, p. 41-45, article number 8461903
Xue, Wei; Moore, Alastair H.; Brookes, Mike; Naylor, Patrick A.
Conference paper

2017 3

Frequency-domain under-modelled blind system identification based on cross power spectrum and sparsity regularization

2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 - Proceedings, Institute of Electrical and Electronics Engineers Inc., 2017, p. 591-595, article number 7952224
Xue, Wei; Brookes, Mike; Naylor, Patrick A.
Conference paper

Long short-term memory recurrent neural network based segment features for music genre classification

Proceedings of 2016 10th International Symposium on Chinese Spoken Language Processing, ISCSLP 2016 / edited by Wang Hsin-Min; Hou Qingzhi; Wei Yuan; Lee Tan; Wei Jianguo; Xie Lei; Feng Hui; Dang Jianwu. Institute of Electrical and Electronics Engineers Inc., 2017, article number 7918369
Dai, Jia; Liang, Shan; Xue, Wei; Ni, Chongjia; Liu, Wenju
Conference paper

Multilingual I-vector based statistical modeling for music genre classification

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v. 2017-August, p. 459-463
Dai, Jia; Xue, Wei; Liu, Wenju
Conference paper

2016 4

A novel codebook representation method and encoding strategy for bag-of-words based acoustic event classification

2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2015, Institute of Electrical and Electronics Engineers Inc., 2016, p. 31-34, article number 7415326
Dai, Jia; Ni, Chongjia; Xue, Wei; Liu, Wenju
Conference paper

Cross-correlation based under-modelled multichannel blind acoustic system identification with sparsity regularization

2016 24th European Signal Processing Conference, EUSIPCO 2016, European Signal Processing Conference, EUSIPCO, 2016, p. 718-722, article number 7760342
Xue, Wei; Brookes, Mike; Naylor, Patrick A.
Conference paper

Semi-supervised learning of bottleneck feature for music genre classification

Pattern Recognition - 7th Chinese Conference, CCPR 2016, Proceedings / edited by Tan Tieniu; Chen Xilin; Li Xuelong; Yang Jian; Cheng Hong; Zhou Jie. Springer Verlag, 2016, p. 552-562
Dai, Jia; Liu, Wenju; Zheng, Hao; Xue, Wei; Ni, Chongjia
Conference paper

Under-modelled blind system identification for time delay estimation in reverberant environments

2016 International Workshop on Acoustic Signal Enhancement, IWAENC 2016, Institute of Electrical and Electronics Engineers Inc., 2016, article number 7602923
Xue, Wei; Brookes, Mike; Naylor, Patrick A.
Conference paper

2015 3

Noise Robust Direction of Arrival Estimation for Speech Source with Weighted Bispectrum Spatial Correlation Matrix

IEEE Journal on Selected Topics in Signal Processing, v. 9, (5), p. 837-851, article number 7067389
Xue, Wei; Liu, Wenju; Liang, Shan
Article

Joint optimization of recurrent networks exploiting source auto-regression for source separation

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v. 2015-January, p. 3307-3311
Nie, Shuai; Xue, Wei; Liang, Shan; Zhang, Xueliang; Liu, Wenju; Qiao, Liwei; Li, Jianping
Conference paper

Two-stage multi-target joint learning for monaural speech separation

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v. 2015-January, p. 1503-1507
Nie, Shuai; Liang, Shan; Xue, Wei; Zhang, Xueliang; Liu, Wenju; Dong, Like; Yang, Hong
Conference paper

2014 3

The analysis of the simplification from the ideal ratio to binary mask in signal-to-noise ratio sense

Speech Communication, v. 59, p. 22-30
Liang, Shan; Liu, Wenju; Jiang, Wei; Xue, Wei
Article

DOA estimation of speech source in noisy environments with weighted spatial bispectrum correlation matrix

2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014, Institute of Electrical and Electronics Engineers Inc., 2014, p. 2282-2286, article number 6854006
Xue, Wei; Liang, Shan; Liu, Wenju
Conference paper

Weighted spatial bispectrum correlation matrix for DOA estimation in the presence of interferences

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, p. 2228-2232
Xue, Wei; Liang, Shan; Liu, Wenju
Conference paper

2013 2

The optimal ratio time-frequency mask for speech separation in terms of the signal-to-noise ratio

Journal of the Acoustical Society of America, v. 134, (5), p. EL452-EL458
Liang, Shan; Liu, Wenju; Jiang, Wei; Xue, Wei
Article

Interference robust DOA estimation of human speech by exploiting historical information and temporal correlation

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, p. 2895-2899
Xue, Wei; Liang, Shan; Liu, Wenju
Conference paper

2012 1

Direction of arrival estimation based on subband weighting for noisy conditions

13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012, International Speech Communication Association, 2012, p. 142-145
Xue, Wei; Liu, Wenju
Conference paper

Article 3

A Systematic Review of Context-Aware and AI-Driven Home Energy and Comfort Management

EBES Sustainable Building, v. 1, (1), article number 100007
GAO, Jiajing; SUN, Cheng; ZHUANG, Dian; XUE, Wei; XIANG, Changying

CoCoGesture: Towards coherent co-speech 3D gesture generation in the wild

Information Fusion, v. 126, article number 103613
Qi, Xingqun; Zhang, Hengyuan; Wang, Yatian; Pan, Jiahao; Liu, Chen; Sun, Muyi; Xue, Wei; Zhang, Shanghang; Han, Sirui; Liu, Qifeng; Guo, Yike

HiPrompt: Tuning-free Higher-Resolution Generation with Hierarchical MLLM Prompts

International Journal of Computer Vision, v. 134, (4), article number 147
Liu, Xinyu; He, Yingqing; Guo, Lanqing; Li, Xiang; Jin, Bu; Li, Yan; Chan, Chi Min; Xue, Wei; Luo, Wenhan; Liu, Qifeng; Guo, Yike

Conference paper 6

AudioX: A Unified Framework for Anything-to-Audio Generation

Paper presented at The 14th International Conference on Learning Representations (ICLR 2026), Rio de Janeiro, Brazil
TIAN, Zeyue; LIU, Zhaoyang; JIN, Yizhu; YUAN, Ruibin; Xue, Liumeng; Tan, Xu; CHEN, Qifeng; XUE, Wei; GUO, Yike

Inference-time Scaling for Diffusion-based Audio Super-resolution

Proceedings of the AAAI Conference on Artificial Intelligence / edited by Koenig Sven; Jenkins Chad; Taylor Matthew E.. Association for the Advancement of Artificial Intelligence, 2026, p. 14982-14990
Jin, Yizhu; Ye, Zhen; Tian, Zeyue; Liu, Haohe; Kong, Qiuqiang; Guo, Yike; Xue, Wei

Lighthouse: A Self-Reconfiguring Sociotechnical Infrastructure for the Unforeseen Long-Tail of Urban Crisis

CHI EA '26: Proceedings of the Extended Abstracts of the 2026 CHI Conference on Human Factors in Computing Systems / edited by Oliver Nuria; Shamma David A.; Candello Heloisa; Cesar Pablo; Lopes Pedro; Artizzu Valentino; Draxler Fiona; Lopez Gustavo; Reinschluessel Anke V.; Tong Xin; Toups Dugas Phoebe O.. Association for Computing Machinery, 2026, p. 1–6, article number 434
DENG, Ning; ZHOU, Qinyi; ZHUO, Yuchao; Xue, Wei; GUO, Yike

Semantic Voting: A Self-Evaluation-Free Approach for Efficient LLM Self-Improvement on Unverifiable Open-ended Tasks

14th International Conference on Learning Representations, ICLR 2026, 2026
Jiang, Chunyang; Zhang, Yonggang; Cai, Yiyang; Chan, Chi-Min; Liu, Yulong; Chen, Mingming; Xue, Wei; Guo, Yike

VMChill: A Dataset for Fine-Grained Visual-Musical Synergy

Proceedings of the AAAI Conference on Artificial Intelligence, v. 40, (5), p. 3353-3362
Chi, Xiaowei; Tian, Zeyue; Chen, Jialiang; Xue, Wei

WenetSpeech-Yue: A Large-scale Cantonese Speech Corpus with Multi-dimensional Annotation

Proceedings of the AAAI Conference on Artificial Intelligence, v. 40, (37), p. 31627-31635
Li, Longhao; Guo, Zhao; Chen, Hongjie; Dai, Yuhang; Zhang, Ziyu; Xue, Hongfei; Zuo, Tianlun; Wang, Chengyou; Wang, Shuiyuan; Xu, Xin; Bu, Hui; Li, Jie; Kang, Jian; Zhang, Binbin; Yuan, Ruibin; Zhou, Ziya; Xue, Wei; Xie, Lei

Article 2

Enhancing target speaker extraction with Hierarchical Speaker Representation Learning

Neural Networks, v. 188, article number 107388
He, Shulin; Xue, Wei; Yang, Yang; Zhang, Huaiwen; PAN, Jiahao; Zhang, Xueliang

Every Angle is Worth a Second Glance: Mining Kinematic Skeletal Structures From Multi-View Joint Cloud

IEEE Transactions on Visualization and Computer Graphics, v. 31, (10), p. 7337-7349, article number 10902182
Jiang, Junkun; Chen, Jie; Au, Ho Yin; Chen, Mingyuan; Xue, Wei; Guo, Yike

Conference paper 26

AttnZero: Efficient Attention Discovery for Vision Transformers

Computer Vision – ECCV 2024 - 18th European Conference, Proceedings / edited by Leonardis Aleš; Ricci Elisa; Roth Stefan; Russakovsky Olga; Sattler Torsten; Varol Gül. Springer Science and Business Media Deutschland GmbH, 2025, p. 20-37
Li, Lujun; Wei, Zimian; Dong, Peijie; Luo, Wenhan; Xue, Wei; Liu, Qifeng; Guo, Yike

Auto-GAS: Automated Proxy Discovery for Training-Free Generative Architecture Search

Computer Vision – ECCV 2024 - 18th European Conference, Proceedings / edited by Leonardis Aleš; Ricci Elisa; Roth Stefan; Russakovsky Olga; Sattler Torsten; Varol Gül. Springer Science and Business Media Deutschland GmbH, 2025, p. 38-55
Li, Lujun; Sun, Haosen; Li, Shiwen; Dong, Peijie; Luo, Wenhan; Xue, Wei; Liu, Qifeng; Guo, Yike

BayesKD: Bayesian Knowledge Distillation for Compact LLMs in Constrained Fine-tuning Scenarios

Findings of the Association for Computational Linguistics / edited by Che Wanxiang; Nabende Joyce; Shutova Ekaterina; Pilehvar Mohammad Taher. Association for Computational Linguistics (ACL), 2025, p. 138-152
Li, Wei; Li, Lujun; Lee, Mark; Sun, Shengjie; Zhang, Lei; Xue, Wei; Guo, Yike

Boosting Policy and Process Reward Models with Monte Carlo Tree Search in Open-Domain QA

Findings of the Association for Computational Linguistics: ACL 2025 / edited by Che Wanxiang; Nabende Joyce; Shutova Ekaterina; Pilehvar Mohammad Taher. Association for Computational Linguistics (ACL), 2025, p. 7433-7451
CHAN, Chi-min; XU, Chunpu; ZHU, Junqi; JI, Jiaming; HONG, Donghai; WEN, Pengcheng; JIANG, Chunyang; YE, Zhen; YANG, Yaodong; XUE, Wei; HAN, Sirui; GUO, Yike

Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation

13th International Conference on Learning Representations, ICLR 2025, International Conference on Learning Representations, ICLR, 2025, p. 72147-72190
Sun, Peiwen; Cheng, Sitong; Li, Xiangtai; Ye, Zhen; Liu, Huadai; Zhang, Honggang; Xue, Wei; Guo, Yike

CMD: Controllable Multiview Diffusion for 3D Editing and Progressive Generation

Proceedings - SIGGRAPH 2025 Conference Papers / edited by Spencer Stephen N.. Association for Computing Machinery, Inc, 2025, article number 81
Li, Peng; Ma, Suizhi; Chen, Jialiang; Liu, Yuan; Zhang, Congyi; Xue, Wei; Luo, Wenhan; Sheffer, Alla; Wang, Wenping; Guo, Yike

Co³Gesture: Towards Coherent Concurrent Co-speech 3D Gesture Generation with Interactive Diffusion

13th International Conference on Learning Representations, ICLR 2025, International Conference on Learning Representations, ICLR, 2025, p. 88359-88377
Qi, Xingqun; Wang, Yatian; Zhang, Hengyuan; Pan, Jiahao; Xue, Wei; Zhang, Shanghang; Luo, Wenhan; Liu, Qifeng; Guo, Yike

Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model

Proceedings of the AAAI Conference on Artificial Intelligence, v. 39, (24), p. 25697-25705
Ye, Zhen; Sun, Peiwen; Lei, Jiahe; Lin, Hongzhan; Tan, Xu; Dai, Zheqi; Kong, Qiuqiang; Chen, Jianyi; Pan, Jiahao; Liu, Qifeng; Guo, Yike; Xue, Wei

Delta Decompression for MoE-based LLMs Compression

Proceedings of Machine Learning Research, v. 267, p. 20497-20514
Gu, Hao; Li, Wei; Li, Lujun; Zhu, Qiyuan; Lee, Mark; Sun, Shengjie; Xue, Wei; Guo, Yike

Editing Music with Melody and Text: Using ControlNet for Diffusion Transformer

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Hou, Siyuan; Liu, Shansong; Yuan, Ruibin; Xue, Wei; Shan, Ying; Zhao, Mangsuo; Zhang, Chao

Efficient Fine-Tuning of Large Models via Nested Low-Rank Adaptation

Proceedings of IEEE International Conference on Computer Vision (ICCV), IEEE, 2025, p. 22252-22262
LI, Lujun; LIN, Cheng; LI, Dezhi; Huang, You Liang; LI, Wei; Wu, Tianyu; ZHOU, Jie; XUE, Wei; HAN, Sirui; GUO, Yike

Empowering World Models with Reflection for Embodied Video Prediction

Proceedings of Machine Learning Research, v. 267, p. 10383-10410
Chi, Xiaowei; Fan, Chun Kai; Zhang, Hengyuan; Qi, Xingqun; Zhang, Rongyu; Chen, Anthony; Chan, Chi Min; Xue, Wei; Liu, Qifeng; Zhang, Shanghang; Guo, Yike

EVA: An Embodied World Model for Future Video Anticipation

Paper presented at 42nd International Conference on Machine Learning, Vancouver, Canada
CHI, Xiaowei; FAN, Chun-kai; ZHANG, Hengyuan; QI, Xingqun; ZHANG, Rongyu; CHEN, Anthony; CHAN, Chi-min; XUE, Wei; LIU, Qifeng; ZHANG, Shanghang; GUO, Yike

FlashAudio: Rectified Flows for Fast and High-fidelity Text-to-Audio Generation

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics / edited by Che Wanxiang; Nabende Joyce; Shutova Ekaterina; Pilehvar Mohammad Taher. Association for Computational Linguistics (ACL), 2025, p. 13694-13710
Liu, Huadai; Wang, Jialei; Huang, Rongjie; Liu, Yang; Lu, Heng; Zhao, Zhou; Xue, Wei

Foundation Cures Personalization: Improving Personalized Models' Prompt Consistency via Hidden Foundation Knowledge

Proceedings of the 39th Conference on Neural Information Processing Systems (NeurIPS 2025), 2025
CAI, Yiyang; Jiang, Zhengkai; LIU, Yulong; JIANG, Chunyang; XUE, Wei; GUO, Yike; LUO, Wenhan

Graceful Forgetting in Generative Language Models

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing / edited by Christodoulopoulos Christos; Chakraborty Tanmoy; Rose Carolyn; Peng Violet. Association for Computational Linguistics (ACL), 2025, p. 13165-13180
Jiang, Chunyang; Chan, Chi-min; Cai, Yiyang; Liu, Yulong; Xue, Wei; Guo, Yike

Importance Weighting Can Help Large Language Models Self-Improve

Proceedings of the AAAI Conference on Artificial Intelligence, v. 39, (23), p. 24257-24265
Jiang, Chunyang; Chan, Chi Min; Xue, Wei; Liu, Qifeng; Guo, Yike

LLaSE-G1: Incentivizing Generalization Capability for LLaMA-based Speech Enhancement

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics / edited by Che Wanxiang; Nabende Joyce; Shutova Ekaterina; Pilehvar Mohammad Taher. Association for Computational Linguistics (ACL), 2025, p. 13292-13305
Kang, Boyi; Zhu, Xinfa; Zhang, Zihan; Ye, Zhen; Liu, Mingshuai; Wang, Ziqian; Zhu, Yike; Ma, Guobin; Chen, Jun; Xiao, Longshuai; Weng, Chao; Xue, Wei; Xie, Lei

MelodyEdit: Zero-shot Music Editing with Disentangled Inversion Control

MM 2025 - Proceedings of the 33rd ACM International Conference on Multimedia, Co-Located with MM 2025, Association for Computing Machinery, Inc, 2025, p. 10083-10092
Liu, Huadai; Wang, Jialei; Li, Xiangtai; Wang, Wen; Chen, Qian; Huang, Rongjie; Liu, Yang; Xu, Jiayang; Zhao, Zhou; Xue, Wei

MoE-SVD: Structured Mixture-of-Experts LLMs Compression via Singular Value Decomposition

Proceedings of Machine Learning Research, v. 267, p. 35209-35230
Li, Wei; Li, Lujun; Gu, Hao; Huang, You Liang; Lee, Mark; Sun, Shengjie; Xue, Wei; Guo, Yike

MuPT: A Generative Symbolic Music Pre-trained Transformer

13th International Conference on Learning Representations, ICLR 2025, International Conference on Learning Representations, ICLR, 2025, p. 44591-44617
Qu, Xingwei; Bai, Yuelin; Ma, Yinghao; Zhou, Ziya; Lo, Ka Man; Liu, Jiaheng; Yuan, Ruibin; Min, Lejun; Liu, Xueling; Zhang, Tianyu; Du, Xinrun; Guo, Shuyue; Liang, Yiming; Li, Yizhi; Wu, Shangda; Zhou, Junting; Zheng, Tianyu; Ma, Ziyang; Han, Fengze; Xue, Wei; Xia, Gus; Benetos, Emmanouil; Yue, Xiang; Lin, Chenghua; Tan, Xu; Huang, Wenhao; Fu, Jie; Zhang, Ge

OmniAudio: Generating Spatial Audio from 360-Degree Video

Proceedings of Machine Learning Research, v. 267, p. 39060-39084
Liu, Huadai; Luo, Tianyi; Luo, Kaicheng; Jiang, Qikai; Sun, Peiwen; Wang, Jialei; Huang, Rongjie; Chen, Qian; Wang, Wen; Li, Xiangtai; Zhang, Shiliang; Yan, Zhijie; Zhao, Zhou; Xue, Wei

PSHuman: Photorealistic Single-image 3D Human Reconstruction using Cross-Scale Multiview Diffusion and Explicit Remeshing

2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2025, article number 11094961
LI, Peng; Zheng, Wangguandong; LIU, Yuan; Yu, Tao; Li, Yangguang; QI, Xingqun; CHI, Xiaowei; Xia, Siyu; Cao, Yanpei; XUE, Wei; LUO, Wenhan; GUO, Yike

STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs

13th International Conference on Learning Representations, ICLR 2025, International Conference on Learning Representations, ICLR, 2025, p. 12014-12039
Dong, Peijie; Li, Lujun; Zhong, Yuedong; Du, Dayou; Fan, Ruibo; Chen, Yuhan; Tang, Zhenheng; Wang, Qiang; Xue, Wei; Guo, Yike; Chu, Xiaowen

ThinkSound: Chain-of-Thought Reasoning in Multimodal Large Language Models for Audio Generation and Editing

Advances in Neural Information Processing Systems, 2025
Liu, Huadai; Luo, Kaicheng; Wang, Jialei; Wang, Wen; Chen, Qian; Zhao, Zhou; Xue, Wei

Vidmuse: A simple video-to-music generation framework with long-short-term modeling

2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Institute of Electrical and Electronics Engineers Inc., 2025, p. 18782-18793, article number 11094002
TIAN, Zeyue; LIU, Zhaoyang; YUAN, Ruibin; PAN, Jiahao; LIU, Qifeng; TAN, Xu; CHEN, Qifeng; XUE, Wei; GUO, Yike

Article 1

Deep Cross-Modal Retrieval Between Spatial Image and Acoustic Speech

IEEE Transactions on Multimedia, v. 26, p. 4480-4489
Qian, Xinyuan; Xue, Wei; Zhang, Qiquan; Tao, Ruijie; Li, Haizhou

Conference paper 15

Can LLMs "Reason" in Music? An Evaluation of LLMs' Capability of Music Understanding and Generation

Paper presented at International Society for Music Information Retrieval Conference (ISMIR 2024)
Wu, Yuhang; Zhang, Xinyue; Benetos, Emmanouil; Ma, Yinghao; Wang, LU; Wu, Zhiyue; Guo, Yike; Xue, Wei; Yuan, Ruibin; Zhou, Ziya

ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate

Paper presented at 12th International Conference on Learning Representations, ICLR 2024, Hybrid, Vienna, Austria
Chan, Chi Min; Chen, Weize; Su, Yusheng; Yu, Jianxuan; Xue, Wei; Zhang, Shanghang; Fu, Jie; Liu, Zhiyuan

ChatMusician: Understanding and Generating Music Intrinsically with LLM

Paper presented at 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024)
Benetos, Emmanouil; Chen, Wenhu; Huang, Wenhao; Jiang, Tao; Kang, Shiyin; Li, Yizhi; Li, Pengfei; Liang, Yiming; Lin, Chenghua; Liu, Cong; Liu, Qin; Liu, Ruibo; Ma, Ziyang; Ma, Yinghao; Shen, Tianhao; Wang, Ziyu; Wang, Zili; Wu, Yuhang; Wu, Jingcheng; Xia, Gus; Xue, Liumeng; Zhang, Ge; Zheng, Tianyu; Lin, Hanfeng; Wu, Shangda; Dannenberg, Roger; Wang, Yi; Liu, Qifeng; Chi, Xiaowei; Fu, Jie; Guo, Yike; Tian, Zeyue; Xue, Wei; Yuan, Ruibin; Zhou, Ziya

ChatMusician: Understanding and Generating Music Intrinsically with LLMs

The 62nd Annual Meeting of the Association for Computational Linguistics / edited by Ku Lun-Wei; Martins Andre; Srikumar Vivek. Association for Computational Linguistics (ACL), 2024, p. 6252-6271
Multimodal Art Projection Research Community; Yuan, Ruibin; Lin, Hanfeng; Wang, Yi; Tian, Zeyue; Wu, Shangda; Shen, Tianhao; Zhang, Ge; Wu, Yuhang; Liu, Cong; Zhou, Ziya; Xue, Liumeng; Ma, Ziyang; Liu, Qin; Zheng, Tianyu; Li, Yizhi; Ma, Yinghao; Liang, Yiming; Chi, Xiaowei; Liu, Ruibo; Wang, Zili; Lin, Chenghua; Liu, Qifeng; Jiang, Tao; Huang, Wenhao; Chen, Wenhu; Fu, Jie; Benetos, Emmanouil; Xia, Gus; Dannenberg, Roger; Xue, Wei; Kang, Shiyin; Guo, Yike

CoMoSVC: Consistency Model-based Singing Voice Conversion

2024 14th International Symposium on Chinese Spoken Language Processing, ISCSLP 2024 / edited by Qian Yanmin; Jin Qin; Ou Zhijian; Ling Zhenhua; Wu Zhiyong; Li Ya; Xie Lei; Tao Jianhua. Institute of Electrical and Electronics Engineers Inc., 2024, p. 184-188
Lu, Yiwen; Ye, Zhen; Xue, Wei; Tan, Xu; Liu, Qifeng; Guo, Yike

ComposerX: Multi-Agent Symbolic Music Composition with LLMs

Paper presented at International Society for Music Information Retrieval Conference (ISMIR 2024)
Zhang, Ge; Li, Yizhi; Lin, Chenghua; Xia, Guangyu; Huang, Yipeng; Lin, Hanfeng; Wang, Yi; Benetos, Emmanouil; Ma, Yinghao; Yang, Qikai; Deng, Qixin; Liu, Xubo; Wang, Wenwu; Fu, Jie; Guo, Yike; Pan, Jiahao; Tian, Zeyue; Xue, Wei; Yuan, Ruibin

Discovering Sparsity Allocation for Layer-wise Pruning of Large Language Models

Advances in Neural Information Processing Systems, v. 37
Li, Lujun; Dong, Peijie; Tang, Zhenheng; Liu, Xiang; Wang, Qiang; Luo, Wenhan; Xue, Wei; Liu, Qifeng; Chu, Xiaowen; Guo, Yike

FastSAG: Towards Fast Non-Autoregressive Singing Accompaniment Generation

Paper presented at 33rd International Joint Conference on Artificial Intelligence (IJCAI 2024)
Liu, Qifeng; Tan, Xu; Chen, Jianyi; Guo, Yike; Xue, Wei; Ye, Zhen

FlashSpeech: Efficient Zero-Shot Speech Synthesis

MM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia, Association for Computing Machinery, Inc, 2024, p. 6998-7007
Ye, Zhen; Ju, Zeqian; Liu, Haohe; Tan, Xu; Chen, Jianyi; Lu, Yiwen; Sun, Peiwen; Pan, Jiahao; Bian, Weizhen; He, Shulin; Xue, Wei; Liu, Qifeng; Guo, Yike

FM-OV3D: Foundation Model-Based Cross-Modal Knowledge Blending for Open-Vocabulary 3D Detection

Proceedings of the AAAI Conference on Artificial Intelligence, v. 38, (15), p. 16723-16731
Zhang, Dongmei; Li, Chang; Zhang, Renrui; Xie, Shenghao; Xue, Wei; Xie, Xiaodong; Zhang, Shanghang

Generated Therapeutic Music Based on the ISO Principle

Music Intelligence - 2nd Summit, SOMI 2023, Revised Selected Papers / edited by Li Xiaobing; Guan Xiaohong; Tie Yun; Zhang Xinran; Zhou Qingwen. Springer Science and Business Media Deutschland GmbH, 2024, p. 32-45
Qiu, Zipeng; Yuan, Ruibin; Xue, Wei; Jin, Yucheng

PyramidCodec: Hierarchical Codec for Long-form Music Generation in Audio Domain

EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Findings of EMNLP 2024 / edited by Al-Onaizan Yaser; Bansal Mohit; Chen Yun-Nung. Association for Computational Linguistics (ACL), 2024, p. 4253-4263
Chen, Jianyi; Dai, Zheqi; Ye, Zhen; Tan, Xu; Liu, Qifeng; Guo, Yike; Xue, Wei

RQ-RAG: Learning to Refine Queries for Retrieval Augmented Generation

Paper presented at Conference on Language Modeling (COLM 2024)
Xu, Chunpu; Luo, Hongyin; Chan, Chi-min; Fu, Jie; Guo, Yike; Xue, Wei; Yuan, Ruibin

ViDA: Homeostatic Visual Domain Adapter for Continual Test Time Adaptation

Paper presented at 12th International Conference on Learning Representations, ICLR 2024, Hybrid, Vienna, Austria
Liu, Jiaming; Yang, Senqiao; Jia, Peidong; Zhang, Renrui; Lu, Ming; Guo, Yandong; Xue, Wei; Zhang, Shanghang

Weakly-Supervised Emotion Transition Learning for Diverse 3D Co-speech Gesture Generation

Paper presented at The IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024
Liu, Qifeng; Zhang, Shanghang; Chi, Xiaowei; Guo, Yike; Li, Peng; Li, Mengfei; Luo, Wenhan; Pan, Jiahao; Qi, Xingqun; Xue, Wei; Yuan, Ruibin

Conference paper 5

CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model

MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia, Association for Computing Machinery, Inc, 2023, p. 1831-1839
Ye, Zhen; Xue, Wei; Tan, Xu; Chen, Jie; Liu, Qifeng; Guo, Yike

GCC-Speaker: Target Speaker Localization with Optimal Speaker-Dependent Weighting in Multi-Speaker Scenarios

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, Proceedings, Institute of Electrical and Electronics Engineers Inc., 2023
Li, Guanjun; Xue, Wei; Liu, Wenju; Yi, Jiangyan; Tao, Jianhua

MARBLE: Music Audio Representation Benchmark for Universal Evaluation

Advances in Neural Information Processing Systems 36 - 37th Conference on Neural Information Processing Systems, NeurIPS 2023 / edited by Oh A.; Neumann T.; Globerson A.; Saenko K.; Hardt M.; Levine S.. Neural information processing systems foundation, 2023
Yuan, Ruibin; Ma, Yinghao; Li, Yizhi; Zhang, Ge; Chen, Xingran; Yin, Hanzhi; Zhuo, Le; Liu, Yiqi; Huang, Jiawen; Tian, Zeyue; Deng, Binyue; Wang, Ningzhi; Lin, Chenghua; Benetos, Emmanouil; Ragni, Anton; Gyenge, Norbert; Dannenberg, Roger; Chen, Wenhu; Xia, Gus; Xue, Wei; Liu, Si; Wang, Shi; Liu, Ruibo; Guo, Yike; Fu, Jie

MoMusic: A Motion-Driven Human-AI Collaborative Music Composition and Performing System

Paper presented at 13th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI 2023)
Bian, Weizhen; Chan, Tin Yan; Gu, Nianzhen; Li, Tsun Sun; Lo, Tsz To; Song, Yijin; Trillo, Roberto Alonso; Wong, King Chak; Xue, Wei

NAS-FM: Neural Architecture Search for Tunable and Interpretable Sound Synthesis Based on Frequency Modulation

Proceedings of the 32nd International Joint Conference on Artificial Intelligence, IJCAI 2023 / edited by Elkind Edith. International Joint Conferences on Artificial Intelligence, 2023, p. 5869-5877
Ye, Zhen; Xue, Wei; Tan, Xu; Liu, Qifeng; Guo, Yike

Article 2

Deep Audio-Visual Beamforming for Speaker Localization

IEEE Signal Processing Letters, v. 29, p. 1132-1136
Qian, Xinyuan; Zhang, Qiquan; Guan, Guohui; Xue, Wei

Speech recognition with a hearing-aid processing scheme combining beamforming with mask-informed speech enhancement

Trends in Hearing, v. 26
Green, Tim; Hilkhuysen, Gaston; Huckvale, Mark; Rosen, Stuart; Brookes, Mike; Moore, Alastair; Naylor, Patrick; Lightburn, Leo; Xue, Wei

Article 1

Speech Enhancement Based on Modulation-Domain Parametric Multichannel Kalman Filtering

IEEE/ACM Transactions on Audio Speech and Language Processing, v. 29, p. 393-405, article number 9272832
Xue, Wei; Moore, Alastair H.; Brookes, Mike; Naylor, Patrick A.

Conference paper 2

Causal System Identification based Compensation for Reverberation-Robust DOA Estimation

29th European Signal Processing Conference, EUSIPCO 2021 - Proceedings, European Signal Processing Conference, EUSIPCO, 2021, p. 1885-1889
He, Li; Xue, Wei

Neural Kalman filtering for speech enhancement

2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021 - Proceedings, Institute of Electrical and Electronics Engineers Inc., 2021, p. 7108-7112
Xue, Wei; Quan, Gang; Zhang, Chao; Ding, Guohong; He, Xiaodong; Zhou, Bowen

Article 1

Frame-GAN: Increasing the frame rate of gait videos with generative adversarial networks

Neurocomputing, v. 380, p. 95-104
Xue, Wei; Ai, Hong; Sun, Tianyu; Song, Chunfeng; Huang, Yan; Wang, Liang

Conference paper 3

SkipConvNet: Skip convolutional neural network for speech dereverberation using optimally smoothed spectral mapping

Interspeech 2020, International Speech Communication Association, 2020, p. 3935-3939
Kothapally, Vinay; Xia, Wei; Ghorbani, Shahram; Hansen, John H.L.; Xue, Wei; Huang, Jing

Sound event localization and detection based on multiple DOA beamforming and multi-task learning

Interspeech 2020, International Speech Communication Association, 2020, p. 5091-5095
Xue, Wei; Tong, Ying; Zhang, Chao; Ding, Guohong; He, Xiaodong; Zhou, Bowen

The JD AI speaker verification system for the FFSVC 2020 challenge

Interspeech 2020, International Speech Communication Association, 2020, p. 3476-3480
Tong, Ying; Xue, Wei; Huang, Shanluo; Fan, Lu; Zhang, Chao; Ding, Guohong; He, Xiaodong

Article 1

Noise covariance matrix estimation for rotating microphone arrays

IEEE/ACM Transactions on Audio Speech and Language Processing, v. 27, (3), p. 519-530, article number 8540424
Moore, Alastair H.; Xue, Wei; Naylor, Patrick A.; Brookes, Mike

Conference paper 1

Direct-path signal cross-correlation estimation for sound source localization in reverberation

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v. 2019-September, p. 2693-2697
Xue, Wei; Tong, Ying; Ding, Guohong; Zhang, Chao; Ma, Tao; He, Xiaodong; Zhou, Bowen

Article 1

Modulation-domain multichannel Kalman filtering for speech enhancement

IEEE/ACM Transactions on Audio Speech and Language Processing, v. 26, (10), p. 1833-1847
Xue, Wei; Moore, Alastair H.; Brookes, Mike; Naylor, Patrick A.

Conference paper 4

Binaural mask-informed speech enhancement for hearing aids with head tracking

16th International Workshop on Acoustic Signal Enhancement, IWAENC 2018 - Proceedings, Institute of Electrical and Electronics Engineers Inc., 2018, p. 461-465, article number 8521361
Moore, Alastair H.; Lightburn, Leo; Xue, Wei; Naylor, Patrick A.; Brookes, Mike

Estimation of the Noise Covariance Matrix for Rotating Sensor Arrays

Conference Record of the 52nd Asilomar Conference on Signals, Systems and Computers, ACSSC 2018 / edited by Matthews Michael B.. IEEE Computer Society, 2018, p. 1936-1941, article number 8645397
Moore, Alastair H.; Xue, Wei; Naylor, Patrick A.; Brookes, Mike

Modulation-domain parametric multichannel Kalman filtering for speech enhancement

2018 26th European Signal Processing Conference, EUSIPCO 2018, European Signal Processing Conference, EUSIPCO, 2018, p. 2509-2513, article number 8552954
Xue, Wei; Moore, Alastair H.; Brookes, Mike; Naylor, Patrick A.

Multichannel Kalman Filtering for Speech Enhancement

2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings, Institute of Electrical and Electronics Engineers Inc., 2018, p. 41-45, article number 8461903
Xue, Wei; Moore, Alastair H.; Brookes, Mike; Naylor, Patrick A.

Conference paper 3

Frequency-domain under-modelled blind system identification based on cross power spectrum and sparsity regularization

2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 - Proceedings, Institute of Electrical and Electronics Engineers Inc., 2017, p. 591-595, article number 7952224
Xue, Wei; Brookes, Mike; Naylor, Patrick A.

Long short-term memory recurrent neural network based segment features for music genre classification

Proceedings of 2016 10th International Symposium on Chinese Spoken Language Processing, ISCSLP 2016 / edited by Wang Hsin-Min; Hou Qingzhi; Wei Yuan; Lee Tan; Wei Jianguo; Xie Lei; Feng Hui; Dang Jianwu; Dang Jianwu. Institute of Electrical and Electronics Engineers Inc., 2017, article number 7918369
Dai, Jia; Liang, Shan; Xue, Wei; Ni, Chongjia; Liu, Wenju

Multilingual I-vector based statistical modeling for music genre classification

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v. 2017-August, p. 459-463
Dai, Jia; Xue, Wei; Liu, Wenju

Conference paper 4

A novel codebook representation method and encoding strategy for bag-of-words based acoustic event classification

2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2015, Institute of Electrical and Electronics Engineers Inc., 2016, p. 31-34, article number 7415326
Dai, Jia; Ni, Chongjia; Xue, Wei; Liu, Wenju

Cross-correlation based under-modelled multichannel blind acoustic system identification with sparsity regularization

2016 24th European Signal Processing Conference, EUSIPCO 2016, European Signal Processing Conference, EUSIPCO, 2016, p. 718-722, article number 7760342
Xue, Wei; Brookes, Mike; Naylor, Patrick A.

Semi-supervised learning of bottleneck feature for music genre classification

Pattern Recognition - 7th Chinese Conference, CCPR 2016, Proceedings / edited by Tan Tieniu; Chen Xilin; Li Xuelong; Yang Jian; Cheng Hong; Zhou Jie. Springer Verlag, 2016, p. 552-562
Dai, Jia; Liu, Wenju; Zheng, Hao; Xue, Wei; Ni, Chongjia

Under-modelled blind system identification for time delay estimation in reverberant environments

2016 International Workshop on Acoustic Signal Enhancement, IWAENC 2016, Institute of Electrical and Electronics Engineers Inc., 2016, article number 7602923
Xue, Wei; Brookes, Mike; Naylor, Patrick A.

Article 1

Noise Robust Direction of Arrival Estimation for Speech Source with Weighted Bispectrum Spatial Correlation Matrix

IEEE Journal on Selected Topics in Signal Processing, v. 9, (5), p. 837-851, article number 7067389
Xue, Wei; Liu, Wenju; Liang, Shan

Conference paper 2

Joint optimization of recurrent networks exploiting source auto-regression for source separation

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v. 2015-January, p. 3307-3311
Nie, Shuai; Xue, Wei; Liang, Shan; Zhang, Xueliang; Liu, Wenju; Qiao, Liwei; Li, Jianping

Two-stage multi-target joint learning for monaural speech separation

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v. 2015-January, p. 1503-1507
Nie, Shuai; Liang, Shan; Xue, Wei; Zhang, Xueliang; Liu, Wenju; Dong, Like; Yang, Hong

Article 1

The analysis of the simplification from the ideal ratio to binary mask in signal-to-noise ratio sense

Speech Communication, v. 59, p. 22-30
Liang, Shan; Liu, Wenju; Jiang, Wei; Xue, Wei

Conference paper 2

DOA estimation of speech source in noisy environments with weighted spatial bispectrum correlation matrix

2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014, Institute of Electrical and Electronics Engineers Inc., 2014, p. 2282-2286, article number 6854006
Xue, Wei; Liang, Shan; Liu, Wenju

Weighted spatial bispectrum correlation matrix for DOA estimation in the presence of interferences

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, p. 2228-2232
Xue, Wei; Liang, Shan; Liu, Wenju

Article 1

The optimal ratio time-frequency mask for speech separation in terms of the signal-to-noise ratio

Journal of the Acoustical Society of America, v. 134, (5), p. EL452-EL458
Liang, Shan; Liu, Wenju; Jiang, Wei; Xue, Wei

Conference paper 1

Interference robust DOA estimation of human speech by exploiting historical information and temporal correlation

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, p. 2895-2899
Xue, Wei; Liang, Shan; Liu, Wenju

Conference paper 1

Direction of arrival estimation based on subband weighting for noisy conditions

13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012, International Speech Communication Association, 2012, p. 142-145
Xue, Wei; Liu, Wenju

2020 4

Frame-GAN: Increasing the frame rate of gait videos with generative adversarial networks

Neurocomputing, v. 380, p. 95-104
Xue, Wei; Ai, Hong; Sun, Tianyu; Song, Chunfeng; Huang, Yan; Wang, Liang
Article

SkipConvNet: Skip convolutional neural network for speech dereverberation using optimally smoothed spectral mapping

Interspeech 2020, International Speech Communication Association, 2020, p. 3935-3939
Kothapally, Vinay; Xia, Wei; Ghorbani, Shahram; Hansen, John H.L.; Xue, Wei; Huang, Jing
Conference paper

Sound event localization and detection based on multiple DOA beamforming and multi-task learning

Interspeech 2020, International Speech Communication Association, 2020, p. 5091-5095
Xue, Wei; Tong, Ying; Zhang, Chao; Ding, Guohong; He, Xiaodong; Zhou, Bowen
Conference paper

The JD AI speaker verification system for the FFSVC 2020 challenge

Interspeech 2020, International Speech Communication Association, 2020, p. 3476-3480
Tong, Ying; Xue, Wei; Huang, Shanluo; Fan, Lu; Zhang, Chao; Ding, Guohong; He, Xiaodong
Conference paper

2019 2

Noise covariance matrix estimation for rotating microphone arrays

IEEE/ACM Transactions on Audio Speech and Language Processing, v. 27, (3), p. 519-530, article number 8540424
Moore, Alastair H.; Xue, Wei; Naylor, Patrick A.; Brookes, Mike
Article

Direct-path signal cross-correlation estimation for sound source localization in reverberation

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v. 2019-September, p. 2693-2697
Xue, Wei; Tong, Ying; Ding, Guohong; Zhang, Chao; Ma, Tao; He, Xiaodong; Zhou, Bowen
Conference paper

2018 5

Modulation-domain multichannel Kalman filtering for speech enhancement

IEEE/ACM Transactions on Audio Speech and Language Processing, v. 26, (10), p. 1833-1847
Xue, Wei; Moore, Alastair H.; Brookes, Mike; Naylor, Patrick A.
Article

Binaural mask-informed speech enhancement for hearing aids with head tracking

16th International Workshop on Acoustic Signal Enhancement, IWAENC 2018 - Proceedings, Institute of Electrical and Electronics Engineers Inc., 2018, p. 461-465, article number 8521361
Moore, Alastair H.; Lightburn, Leo; Xue, Wei; Naylor, Patrick A.; Brookes, Mike
Conference paper

Estimation of the Noise Covariance Matrix for Rotating Sensor Arrays

Conference Record of the 52nd Asilomar Conference on Signals, Systems and Computers, ACSSC 2018 / edited by Matthews Michael B.. IEEE Computer Society, 2018, p. 1936-1941, article number 8645397
Moore, Alastair H.; Xue, Wei; Naylor, Patrick A.; Brookes, Mike
Conference paper

Modulation-domain parametric multichannel Kalman filtering for speech enhancement

2018 26th European Signal Processing Conference, EUSIPCO 2018, European Signal Processing Conference, EUSIPCO, 2018, p. 2509-2513, article number 8552954
Xue, Wei; Moore, Alastair H.; Brookes, Mike; Naylor, Patrick A.
Conference paper

Multichannel Kalman Filtering for Speech Enhancement

2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings, Institute of Electrical and Electronics Engineers Inc., 2018, p. 41-45, article number 8461903
Xue, Wei; Moore, Alastair H.; Brookes, Mike; Naylor, Patrick A.
Conference paper

2017 3

Frequency-domain under-modelled blind system identification based on cross power spectrum and sparsity regularization

2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 - Proceedings, Institute of Electrical and Electronics Engineers Inc., 2017, p. 591-595, article number 7952224
Xue, Wei; Brookes, Mike; Naylor, Patrick A.
Conference paper

Long short-term memory recurrent neural network based segment features for music genre classification

Proceedings of 2016 10th International Symposium on Chinese Spoken Language Processing, ISCSLP 2016 / edited by Wang Hsin-Min; Hou Qingzhi; Wei Yuan; Lee Tan; Wei Jianguo; Xie Lei; Feng Hui; Dang Jianwu. Institute of Electrical and Electronics Engineers Inc., 2017, article number 7918369
Dai, Jia; Liang, Shan; Xue, Wei; Ni, Chongjia; Liu, Wenju
Conference paper

Multilingual I-vector based statistical modeling for music genre classification

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v. 2017-August, p. 459-463
Dai, Jia; Xue, Wei; Liu, Wenju
Conference paper

2016 4

A novel codebook representation method and encoding strategy for bag-of-words based acoustic event classification

2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2015, Institute of Electrical and Electronics Engineers Inc., 2016, p. 31-34, article number 7415326
Dai, Jia; Ni, Chongjia; Xue, Wei; Liu, Wenju
Conference paper

Cross-correlation based under-modelled multichannel blind acoustic system identification with sparsity regularization

2016 24th European Signal Processing Conference, EUSIPCO 2016, European Signal Processing Conference, EUSIPCO, 2016, p. 718-722, article number 7760342
Xue, Wei; Brookes, Mike; Naylor, Patrick A.
Conference paper

Semi-supervised learning of bottleneck feature for music genre classification

Pattern Recognition - 7th Chinese Conference, CCPR 2016, Proceedings / edited by Tan Tieniu; Chen Xilin; Li Xuelong; Yang Jian; Cheng Hong; Zhou Jie. Springer Verlag, 2016, p. 552-562
Dai, Jia; Liu, Wenju; Zheng, Hao; Xue, Wei; Ni, Chongjia
Conference paper

Under-modelled blind system identification for time delay estimation in reverberant environments

2016 International Workshop on Acoustic Signal Enhancement, IWAENC 2016, Institute of Electrical and Electronics Engineers Inc., 2016, article number 7602923
Xue, Wei; Brookes, Mike; Naylor, Patrick A.
Conference paper

2015 3

Noise Robust Direction of Arrival Estimation for Speech Source with Weighted Bispectrum Spatial Correlation Matrix

IEEE Journal on Selected Topics in Signal Processing, v. 9, (5), p. 837-851, article number 7067389
Xue, Wei; Liu, Wenju; Liang, Shan
Article

Joint optimization of recurrent networks exploiting source auto-regression for source separation

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v. 2015-January, p. 3307-3311
Nie, Shuai; Xue, Wei; Liang, Shan; Zhang, Xueliang; Liu, Wenju; Qiao, Liwei; Li, Jianping
Conference paper

Two-stage multi-target joint learning for monaural speech separation

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v. 2015-January, p. 1503-1507
Nie, Shuai; Liang, Shan; Xue, Wei; Zhang, Xueliang; Liu, Wenju; Dong, Like; Yang, Hong
Conference paper

2014 3

The analysis of the simplification from the ideal ratio to binary mask in signal-to-noise ratio sense

Speech Communication, v. 59, p. 22-30
Liang, Shan; Liu, Wenju; Jiang, Wei; Xue, Wei
Article

DOA estimation of speech source in noisy environments with weighted spatial bispectrum correlation matrix

2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014, Institute of Electrical and Electronics Engineers Inc., 2014, p. 2282-2286, article number 6854006
Xue, Wei; Liang, Shan; Liu, Wenju
Conference paper

Weighted spatial bispectrum correlation matrix for DOA estimation in the presence of interferences

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, p. 2228-2232
Xue, Wei; Liang, Shan; Liu, Wenju
Conference paper

2013 2

The optimal ratio time-frequency mask for speech separation in terms of the signal-to-noise ratio

Journal of the Acoustical Society of America, v. 134, (5), p. EL452-EL458
Liang, Shan; Liu, Wenju; Jiang, Wei; Xue, Wei
Article

Interference robust DOA estimation of human speech by exploiting historical information and temporal correlation

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, p. 2895-2899
Xue, Wei; Liang, Shan; Liu, Wenju
Conference paper

2012 1

Direction of arrival estimation based on subband weighting for noisy conditions

13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012, International Speech Communication Association, 2012, p. 142-145
Xue, Wei; Liu, Wenju
Conference paper

Teaching Assignment

AMCC5220	Technology in Music and Sound Art
AMCC6110	Professional Practice and Research (Internships)
AMCC6900C	Independent Study in Arts and Machine Creativity

AMCC5000	Creative Convergence: Foundations of Arts and Machine Creativity
AMCC5010	Research Methodology in Arts and Machine Creativity
ARIN5202	Machine Learning for Natural Language Processing
EMIA6950F	Independent Study
MAIE5221	Natural Language Processing

EMIA4110	Practical Machine Learning
EMIA6500N	Digital Audio Processing
IIMP6090	Postgraduate Seminar

Research Postgraduate (RPG) Supervision

From January 2023 to December 2026 (As of 27 June 2026)

Current RPGs

Projects

From January 2024 to December 2026
