Style-Friendly SNR Sampler for Style-Driven Generation
Jooyoung Choi*, Chaehun Shin*, Yeongtak Oh, Heeseung Kim, Jungbeom Lee, and Sungroh Yoon, in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Tucson, USA, March 2026.
Beyond Language-Specific Neurons: The Challenge of Identifying Speech-Specific Neurons in Multimodal LLMs
Nohil Park, Che Hyun Lee, Jiheum Yeom, Heeseung Kim, and Sungroh Yoon, in IEEE Journal of Selected Topics in Signal Processing, in press.
Does Your Voice Assistant Remember? Analyzing Conversational Context Recall and Utilization in Voice Interaction Models
Heeseung Kim*, Che Hyun Lee*, Sangkwon Park, Jiheum Yeom, Nohil Park, Sangwon Yu, and Sungroh Yoon, in Findings of the Association for Computational Linguistics (ACL Findings), Vienna, Austria, July 2025.
EdiText: Controllable Coarse-to-Fine Text Editing with Diffusion Language Models
Che Hyun Lee, Heeseung Kim, Jiheum Yeom, and Sungroh Yoon, in Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), Vienna, Austria, July 2025.
Large-Scale Text-to-Image Model with Inpainting is a Zero-Shot Subject-Driven Image Generator
Chaehun Shin, Jooyoung Choi, Heeseung Kim, and Sungroh Yoon, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, USA, June 2025.
NanoVoice: Efficient Speaker-Adaptive Text-to-Speech for Multiple Speakers
Nohil Park, Heeseung Kim, Che Hyun Lee, Jooyoung Choi, Jiheum Yeom, and Sungroh Yoon, in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Hyderabad, India, April 2025. (Oral)
VoiceGuider: Enhancing Out-of-Domain Performance in Parameter-Efficient Speaker-Adaptive Text-to-Speech via Autoguidance
Jiheum Yeom, Heeseung Kim, Jooyoung Choi, Che Hyun Lee, Nohil Park, and Sungroh Yoon, in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Hyderabad, India, April 2025.
Paralinguistics-Aware Speech-Empowered Large Language Models for Natural Conversation
Heeseung Kim, Soonshin Seo, Kyeongseok Jeong, Ohsung Kwon, Soyoon Kim, Jungwhan Kim, Jaehong Lee, Eunwoo Song, Myungwoo Oh, Jung-Woo Ha, Sungroh Yoon, and Kang Min Yoo, in Advances in Neural Information Processing Systems (NeurIPS), Vancouver, Canada, December 2024.
VoiceTailor: Lightweight Plug-In Adapter for Diffusion-Based Personalized Text-to-Speech
Heeseung Kim, Sang-gil Lee, Jiheum Yeom, Che Hyun Lee, Sungwon Kim, and Sungroh Yoon, in Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH), Kos, Greece, September 2024.
HyperCLOVA X Technical Report
HyperCLOVA X Team, NAVER Cloud, arXiv preprint, 2024.
UnitSpeech: Speaker-adaptive Speech Synthesis with Untranscribed Data
Heeseung Kim, Sungwon Kim, Jiheum Yeom, and Sungroh Yoon, in Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH), Dublin, Ireland, August 2023. (Oral)
Edit-A-Video: Single Video Editing with Object-Aware Consistency
Chaehun Shin*, Heeseung Kim*, Che Hyun Lee, Sang-gil Lee, and Sungroh Yoon, in Proceedings of the Asian Conference on Machine Learning (ACML), Istanbul, Turkey, November 2023. (Oral, Best Paper Award)
Guided-TTS: A Diffusion Model for Text-to-Speech via Classifier Guidance
Heeseung Kim*, Sungwon Kim*, and Sungroh Yoon, in Proceedings of the International Conference on Machine Learning (ICML), Baltimore, USA, July 2022.
PriorGrad: Improving Conditional Denoising Diffusion Models with Data-Dependent Adaptive Prior
Sang-gil Lee, Heeseung Kim, Chaehun Shin, Xu Tan, Chang Liu, Qi Meng, Tao Qin, Wei Chen, Sungroh Yoon, and Tie-Yan Liu, in Proceedings of the International Conference on Learning Representations (ICLR), Virtual, April 2022.
Rare Tokens Degenerate All Tokens: Improving Neural Text Generation via Adaptive Gradient Gating for Rare Token Embeddings
Sangwon Yu, Jongyoon Song, Heeseung Kim, Seong-min Lee, Woo-Jong Ryu, and Sungroh Yoon, in Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), Dublin, Ireland, May 2022.
Stein Latent Optimization for Generative Adversarial Networks
Uiwon Hwang, Heeseung Kim, Dahuin Jung, Hyemi Jang, Hyungyu Lee, and Sungroh Yoon, in Proceedings of the International Conference on Learning Representations (ICLR), Virtual, April 2022.
Silent Speech Recognition with Strain Sensors and Deep Learning Analysis of Directional Facial Muscle Movement
Hyunjun Yoo*, Eunji Kim*, Jong Won Chung*, Hyeon Cho, Sujin Jeong, Heeseung Kim, Dongju Jang, Hayun Kim, Jinsu Yoon, Gae Hwang Lee, Hyunbum Kang, Joo-Young Kim, Youngjun Yun, Sungroh Yoon, and Yongtaek Hong, in ACS Applied Materials & Interfaces, 2022.
Guided-TTS 2: A Diffusion Model for High-quality Adaptive Text-to-Speech with Untranscribed Data
Sungwon Kim*, Heeseung Kim*, and Sungroh Yoon, arXiv preprint, 2022.