DEng in Pattern Recognition and Intelligent Systems
University of Chinese Academy of Sciences, 2015
Enhancing target speaker extraction with Hierarchical Speaker Representation Learning
Article
Article
AttnZero: Efficient Attention Discovery for Vision Transformers
Conference paper
Auto-GAS: Automated Proxy Discovery for Training-Free Generative Architecture Search
Conference paper
Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation
Conference paper
CMD: Controllable Multiview Diffusion for 3D Editing and Progressive Generation
Conference paper
Co3Gesture: Towards Coherent Concurrent Co-speech 3D Gesture Generation with Interactive Diffusion
Conference paper
Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model
Conference paper
Editing Music with Melody and Text: Using ControlNet for Diffusion Transformer
Conference paper
EVA: An Embodied World Model for Future Video Anticipation
Conference paper
FlashAudio: Rectified Flows for Fast and High-fidelity Text-to-Audio Generation
Conference paper
Importance Weighting Can Help Large Language Models Self-Improve
Conference paper
LLaSE-G1: Incentivizing Generalization Capability for LLaMA-based Speech Enhancement
Conference paper
MelodyEdit: Zero-shot Music Editing with Disentangled Inversion Control
Conference paper
MoE-SVD: Structured Mixture-of-Experts LLMs Compression via Singular Value Decomposition
Conference paper
MuPT: A Generative Symbolic Music Pre-trained Transformer
Conference paper
OmniAudio: Generating Spatial Audio from 360-Degree Video
Conference paper
STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs
Conference paper
Deep Cross-Modal Retrieval Between Spatial Image and Acoustic Speech
Article
Can LLMs "Reason" in Music? An Evaluation of LLMs' Capability of Music Understanding and Generation
Conference paper
ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate
Conference paper
ChatMusician: Understanding and Generating Music Intrinsically with LLM
Conference paper
ChatMusician: Understanding and Generating Music Intrinsically with LLMs
Conference paper
CoMoSVC: Consistency Model-based Singing Voice Conversion
Conference paper
ComposerX: Multi-Agent Symbolic Music Composition with LLMs
Conference paper
Discovering Sparsity Allocation for Layer-wise Pruning of Large Language Models
Conference paper
FastSAG: Towards Fast Non-Autoregressive Singing Accompaniment Generation
Conference paper
FlashSpeech: Efficient Zero-Shot Speech Synthesis
Conference paper
FM-OV3D: Foundation Model-Based Cross-Modal Knowledge Blending for Open-Vocabulary 3D Detection
Conference paper
Generated Therapeutic Music Based on the ISO Principle
Conference paper
PyramidCodec: Hierarchical Codec for Long-form Music Generation in Audio Domain
Conference paper
RQ-RAG: Learning to Refine Queries for Retrieval Augmented Generation
Conference paper
ViDA: Homeostatic Visual Domain Adapter for Continual Test Time Adaptation
Conference paper
Weakly-Supervised Emotion Transition Learning for Diverse 3D Co-speech Gesture Generation
Conference paper
CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model
Conference paper
Conference paper
MARBLE: Music Audio Representation Benchmark for Universal Evaluation
Conference paper
MoMusic: A Motion-Driven Human-AI Collaborative Music Composition and Performing System
Conference paper
Conference paper
Speech Enhancement Based on Modulation-Domain Parametric Multichannel Kalman Filtering
Article
Causal System Identification based Compensation for Reverberation-Robust DOA Estimation
Conference paper
Neural Kalman filtering for speech enhancement
Conference paper
Frame-GAN: Increasing the frame rate of gait videos with generative adversarial networks
Article
Conference paper
Sound event localization and detection based on multiple DOA beamforming and multi-task learning
Conference paper
The JD AI speaker verification system for the FFSVC 2020 challenge
Conference paper
Noise covariance matrix estimation for rotating microphone arrays
Article
Direct-path signal cross-correlation estimation for sound source localization in reverberation
Conference paper
Modulation-domain multichannel Kalman filtering for speech enhancement
Article
Binaural mask-informed speech enhancement for hearing aids with head tracking
Conference paper
Estimation of the Noise Covariance Matrix for Rotating Sensor Arrays
Conference paper
Modulation-domain parametric multichannel Kalman filtering for speech enhancement
Conference paper
Multichannel Kalman Filtering for Speech Enhancement
Conference paper
Conference paper
Conference paper
Multilingual I-vector based statistical modeling for music genre classification
Conference paper
Conference paper
Conference paper
Semi-supervised learning of bottleneck feature for music genre classification
Conference paper
Under-modelled blind system identification for time delay estimation in reverberant environments
Conference paper
Article
Joint optimization of recurrent networks exploiting source auto-regression for source separation
Conference paper
Two-stage multi-target joint learning for monaural speech separation
Conference paper
Article
Conference paper
Weighted spatial bispectrum correlation matrix for DOA estimation in the presence of interferences
Conference paper
The optimal ratio time-frequency mask for speech separation in terms of the signal-to-noise ratio
Article
Conference paper
Direction of arrival estimation based on subband weighting for noisy conditions
Conference paper
| AMCC5220 | Technology in Music and Sound Art |
| AMCC6110 | Professional Practice and Research (Internships) |
| AMCC5000 | Creative Convergence: Foundations of Arts and Machine Creativity |
| AMCC5010 | Research Methodology in Arts and Machine Creativity |
| ARIN5202 | Machine Learning for Natural Language Processing |
| EMIA6950F | Independent Study |
| MAIE5221 | Natural Language Processing |
| EMIA4110 | Practical Machine Learning |
| EMIA6500N | Digital Audio Processing |
| IIMP6090 | Postgraduate Seminar |
| EMIA6950N | Independent Study |
| No Teaching Assignments |
BIAN, Weizhen
Individualized Interdisciplinary Program
DENG, Ning
Individualized Interdisciplinary Program
GUO, Haoqiang
(co-supervision)
Individualized Interdisciplinary Program
GUO, Shengyao
(co-supervision)
Individualized Interdisciplinary Program
GUO, Yijie
(co-supervision)
Individualized Interdisciplinary Program
JI, Xinjian
Arts and Machine Creativity
KANG, Boyi
Individualized Interdisciplinary Program
LI, Yiming
Individualized Interdisciplinary Program
LIN, Sida
(co-supervision)
Individualized Interdisciplinary Program
LIU, Huadai
Individualized Interdisciplinary Program
PAN, Jiahao
Individualized Interdisciplinary Program
PENG, Yi
Individualized Interdisciplinary Program
WANG, Lei
Individualized Interdisciplinary Program
WANG, Yatian
(co-supervision)
Individualized Interdisciplinary Program
ZHANG, Zhenyuan
(co-supervision)
Individualized Interdisciplinary Program
CHAN, Chi-min
(co-supervision)
Individualized Interdisciplinary Program
CHENG, Sitong
Individualized Interdisciplinary Program
JIA, Xianzhang
(co-supervision)
Individualized Interdisciplinary Program
JIANG, Chunyang
(co-supervision)
Individualized Interdisciplinary Program
JIN, Yizhu
Individualized Interdisciplinary Program
LIU, Yulong
(co-supervision)
Computer Science and Engineering
LU, Yiwen
(co-supervision)
Individualized Interdisciplinary Program
SIMA, Qie
(co-supervision)
Individualized Interdisciplinary Program
ZHU, Chuanbo
(co-supervision)
Individualized Interdisciplinary Program
CHEN, Jianyi
Individualized Interdisciplinary Program
TIAN, Zeyue
(co-supervision)
Individualized Interdisciplinary Program
YE, Zhen
Individualized Interdisciplinary Program
YUAN, Ruibin
(co-supervision)
Individualized Interdisciplinary Program
ZHOU, Ziya
(co-supervision)
Individualized Interdisciplinary Program
CHEN, Pengyu
(co-supervision)
Individualized Interdisciplinary Program