Yushi Huang 黄雨石

Ph.D. Student at HKUST · Efficient generative models

Hong Kong SAR


I am a Ph.D. student at the Hong Kong University of Science and Technology (HKUST), supervised by Prof. Jun Zhang. I received my B.Eng. degree from Beihang University. My research focuses on building efficient, high-performing generative systems. I currently work on RL for efficient image/video generation. Previously, I worked on improving the inference efficiency of vision and language generative models, including low-precision inference, computation skipping, and efficient attention.

I am always happy to chat about research and potential collaborations, so feel free to reach out.

News

  • 2026.05:  🎉🎉 Our Light Forcing, SGMD, and Flash-VAED were accepted to ICML.
  • 2026.04:  🎉🎉 Our LinVideo was selected as a CVPR 2026 Highlight Poster.
  • 2026.04:  🎉🎉 Our Focus-dLLM was accepted to ACL (Main).
  • 2026.02:  🎉🎉 Our MoDES and LinVideo were accepted to CVPR.
  • 2026.01:  🎉🎉 Our QVGen was accepted to ICLR.
  • 2025.11:  🎉🎉 Our SlimInfer and LLMC+ were accepted to AAAI.
  • 2025.06:  🎉🎉 Our Temporal Feature Matters was accepted to TPAMI.
  • 2025.05:  🎉🎉 Our HarmoniCa was accepted to ICML.
  • 2024.10:  🎉🎉 Our LLMC was accepted to the EMNLP Industry Track.
  • 2024.07:  🎉🎉 Our PTSBench was accepted to ACM MM.
  • 2024.02:  🎉🎉 Our TFMQ-DM was accepted to CVPR as a Highlight Poster.

Publications

* equal contribution  ·  📧 corresponding author  ·  full list sorted by date.

CVPR 2026 Highlight
LinVideo

LinVideo: A Post-Training Framework towards $\mathcal{O}(n)$ Attention in Efficient Video Generation

Yushi Huang, Xingtong Ge, Ruihao Gong📧, Chengtao Lv, Jun Zhang📧

CVPR 2026
MoDES

MoDES: Accelerating Mixture-of-Experts Multimodal Large Language Models via Dynamic Expert Skipping

Yushi Huang, Zining Wang, Zhihang Yuan📧, Yifu Ding, Ruihao Gong, Jinyang Guo, Xianglong Liu, Jun Zhang📧

ICLR 2026
QVGen

QVGen: Pushing the Limit of Quantized Video Generative Models

Yushi Huang, Ruihao Gong📧, Jing Liu, Yifu Ding, Chengtao Lv, Haotong Qin, Jun Zhang📧

arXiv 2026
RTDMD

Reinforcing Few-step Generators via Reward-Tilted Distribution Matching

Yushi Huang*, Xiangxin Zhou*, Ruoyu Wang*, Chi Zhang, Jun Zhang, Tianyu Pang📧

TPAMI 2025
Temporal Feature Matters

Temporal Feature Matters: A Framework for Diffusion Model Quantization

Yushi Huang, Ruihao Gong, Xianglong Liu📧, Jing Liu, Yuhang Li, Jiwen Lu, Dacheng Tao

ICML 2025
HarmoniCa

HarmoniCa: Harmonizing Training and Inference for Better Feature Caching in Diffusion Transformer Acceleration

Yushi Huang*, Zining Wang*, Ruihao Gong📧, Jing Liu, Xinjie Zhang, Jinyang Guo, Xianglong Liu, Jun Zhang📧

CVPR 2024 Highlight
TFMQ-DM

TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models

Yushi Huang*, Ruihao Gong*, Jing Liu, Tianlong Chen, Xianglong Liu📧

EMNLP 2024
LLMC

LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit

Ruihao Gong*, Yang Yong*, Shiqiao Gu*, Yushi Huang*, Chengtao Lv, Yunchen Zhang, Dacheng Tao, Xianglong Liu📧

Projects

Toolkit
LightCompress

LightCompress is an off-the-shelf compression suite for AIGC models (LLMs, VLMs, diffusion, etc.) that packages SOTA quantization, sparsification, and deployment best practices to shrink models while preserving accuracy. 700+ GitHub Stars.

GitHub

Honors & Services

Education

The Hong Kong University of Science and Technology
Ph.D. in Electronic and Computer Engineering · Advised by Prof. Jun Zhang
2025.02 – Now

Beihang University
B.Eng. in Computer Science and Technology · Shenyuan Honors College
2020.09 – 2024.06

Experience

Tencent · Hunyuan
Research Intern
RL for image/video generative models
2026.02 – Now

ByteDance · Seed
Research Intern
Inference acceleration for multimodal LLMs
2025.09 – 2025.11

Microsoft Research Asia
Research Intern
Video generation and world models
2024.12 – 2025.02

SenseTime Research
Research Intern
Compression and acceleration for image/video diffusion models and LLMs
2023.05 – 2025.03