Hyesung Jeon
I am a Ph.D. student in Electrical and Computer Engineering at
Seoul National University,
advised by Prof. Jae-Joon Kim.
My research focuses on efficient algorithms and serving systems for generative models and AI agents,
spanning model compression, PEFT, and efficient attention mechanisms.
I am open to discussions about research collaborations and job opportunities :)
Email | CV | LinkedIn | GitHub | Scholar
Research Interests
Keywords: Efficient AI Serving, Model Compression, PEFT, KV Cache Compression, Agentic AI Systems
My primary research focus is building efficient algorithms and serving systems for generative models,
such as large language models, large multi-modal models, and diffusion models. My work mainly spans
the post-training stack, from quantization-aware fine-tuning to KV cache management of agentic LLMs.
I am drawn to the emerging efficiency challenges with long-context inference and agentic workloads.
In particular, my research interests lie in:
- Model compression (quantization and pruning) and parameter-efficient fine-tuning
- KV cache compression and efficient attention mechanisms
- Efficient inference systems for LLM-based agents
- Hardware-software co-design for low-precision training and inference
Education
Seoul National University
Ph.D. Student in Electrical and Computer Engineering
Advisor: Prof. Jae-Joon Kim
Mar. 2023 – Present
Seoul National University
B.S. in Electrical and Computer Engineering
Summa Cum Laude (GPA 4.04/4.30)
Mar. 2019 – Feb. 2023
Publications
* Equal contribution
[7] LRAgent: Efficient KV Cache Sharing for Multi-LoRA LLM Agents
Hyesung Jeon,
Hyeongju Ha,
Jae-Joon Kim
ICML, Jul. 2026
Paper | Code | BibTeX
[6] QWHA: Quantization-Aware Walsh-Hadamard Adaptation for Parameter-Efficient Fine-Tuning on Large Language Models
Hyesung Jeon*,
Seojune Lee*,
Beomseok Kang,
Yulhwa Kim,
Jae-Joon Kim
ICLR, Apr. 2026
Paper | Code | BibTeX
[5] PRESTE: Preserving Tiny Exponent Precision for Efficient Sub-8-bit LLM Inference and Fine-Tuning
Jehun Lee,
Hyesung Jeon,
Juchan Lee,
Jae-Joon Kim
Preprint, Jan. 2026
Paper | BibTeX
[4] L4Q: Parameter Efficient Quantization-Aware Fine-Tuning on Large Language Models
Hyesung Jeon,
Yulhwa Kim,
Jae-Joon Kim
ACL, Jul. 2025
Paper | BibTeX
[3] Leveraging Early-Stage Robustness in Diffusion Models for Efficient and High-Quality Image Synthesis
Yulhwa Kim,
Dongwon Jo,
Hyesung Jeon,
Taesu Kim,
Daehyun Ahn,
Hyungjun Kim,
Jae-Joon Kim
NeurIPS, Dec. 2023
Paper | BibTeX
[2] Squeezing Large-Scale Diffusion Models for Mobile
Jiwoong Choi,
Minkyu Kim,
Daehyun Ahn,
Taesu Kim,
Yulhwa Kim,
Dongwon Jo,
Hyesung Jeon,
Jae-Joon Kim,
Hyungjun Kim
ICML Workshop on Challenges of Deploying Generative AI, Jul. 2023
Paper | BibTeX
[1] AND-Net Based Multi Precision Neural Network Accelerator Design
Hyesung Jeon,
Jae-Joon Kim
B.S. Graduate Thesis (in Korean), Feb. 2023
Paper | BibTeX
MangoBoost (Internship)
RDMA System Architecture Design for Data Processing Unit
Mentor: Prof. Jangwoo Kim
Apr. 2022 – Jun. 2022
SK Hynix Solution Center (Internship)
Deep Learning Network Design on SoC-NAND Validation
The Encouragement Prize in the Internship Workshop
Mentor: Dr. Yong Lee
Jun. 2021 – Aug. 2021
Slides
Seoul National University, Grad. School of Convergence Science and Technology (Student Researcher)
Deep Learning Network Architecture Design for Nano-optical Layer
Mentor: Prof. Changsoon Kim
Jan. 2021 – Feb. 2021
Code | Slides
Seoul National University, Dept. of System Semiconductor Engineering for AI (Student Researcher)
Development of English Subtitle Generation Server
Development of Visitor Counter Application with Face Detection Model
Mentor: Dr. Seokkyu Kwon
Sep. 2020 – Feb. 2021
Code (Video Translation) | Code (Face Counting) | Slides
Honors & Awards
Qualcomm Innovation Fellowship Korea, Winner
L4Q: Parameter Efficient Quantization-Aware Fine-Tuning on Large Language Models
2025
Samsung SAIT Computer Engineering Challenge, 3rd Prize
vLLM-based LLM Inference Acceleration on Multi-GPU Systems
2023
SNU Social Responsibility+ Competition, The Encouragement Prize
Arduino-based Refrigerator Storage Assistance Agentic System Design
2020
SNU Electrical Circuit Design Workshop, The Excellence Prize
2020
Korea Presidential Science Scholarship, Full Tuition Scholarship
2019 – 2023
Intel International Science and Engineering Fair, Finalist
2018
Skills
Language: English, Korean
Programming Language: Python, C/C++
Deep Learning Frameworks: PyTorch, vLLM, SGLang, CUDA, Triton, Verilog, HLS
Academic Services
Invited Talks
- Efficient AI Meetup Korea (2024, 2026)
Common Features of KV Cache Compression in LLMs
Reviewer & Committee
Leadership & Volunteering
- SNU Tomorrow's Engineers Membership (STEM), Chairman (2021 – 2023)
Organized and participated in undergraduate mentorship and scholarship programs.