publications

publications by categories in reverse chronological order. generated by jekyll-scholar.

2024

  1. Improving Audio Codec-based Zero-Shot Text-to-Speech Synthesis with Multi-Modal Context and Large Language Model
    Jinlong Xue, Yayue Deng, Yicheng Han, and 2 more authors
    Interspeech, 2024
  2. Retrieval Augmented Generation in Prompt-based Text-to-Speech Synthesis with Context-Aware Contrastive Language-Audio Pretraining
    Jinlong Xue, Yayue Deng, Yingming Gao, and 1 more author
    Interspeech, 2024
  3. Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation
    Jinlong Xue, Yayue Deng, Yingming Gao, and 1 more author
    arXiv preprint arXiv:2401.01044, 2024
  4. CONCSS: Contrastive-based Context Comprehension for Dialogue-appropriate Prosody in Conversational Speech Synthesis
    Yayue Deng, Jinlong Xue, Yukang Jia, and 6 more authors
    IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024

2023

  1. CMCU-CSS: Enhancing Naturalness via Commonsense-based Multi-modal Context Understanding in Conversational Speech Synthesis
    Yayue Deng, Jinlong Xue, Yingming Gao, and 1 more author
    In Proceedings of the 31st ACM International Conference on Multimedia (MM), 2023
  2. M2-CTTS: End-to-End Multi-Scale Multi-Modal Conversational Text-to-Speech Synthesis
    Jinlong Xue, Yayue Deng, Fengping Wang, and 5 more authors
    In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023

2022

  1. ECAPA-TDNN for Multi-speaker Text-to-speech Synthesis
    Jinlong Xue, Yayue Deng, Yichen Han, and 3 more authors
    In 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), 2022
  2. Rhythm-controllable Attention with High Robustness for Long Sentence Speech Synthesis
    Dengfeng Ke, Yayue Deng, Yukang Jia, and 6 more authors
    In 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), 2022
  3. A Keypoint Based Enhancement Method for Audio Driven Free View Talking Head Synthesis
    Yichen Han, Ya Li, Yingming Gao, and 3 more authors
    In 2022 IEEE 24th International Workshop on Multimedia Signal Processing (MMSP), 2022