I am a Ph.D. student at the Hong Kong University of Science and Technology (HKUST), supervised by Prof. Jun Zhang. I received my B.E. degree from Beihang University. My research focuses on building efficient, high-performing generative systems. I currently work on RL for efficient image/video generation. Previously, I worked on improving inference efficiency for vision and language generative models, including low-precision inference, computation skipping, and efficient attention.
I am always happy to chat about research and potential collaborations — feel free to reach out.
News
- 2026.05: 🎉🎉 Our Light Forcing, SGMD, and Flash-VAED are accepted to ICML.
- 2026.04: 🎉🎉 Our LinVideo is selected as a Highlight Poster.
- 2026.04: 🎉🎉 Our Focus-dLLM is accepted to ACL (Main).
- 2026.02: 🎉🎉 Our MoDES and LinVideo are accepted to CVPR.
- 2026.01: 🎉🎉 Our QVGen is accepted to ICLR.
- 2025.11: 🎉🎉 Our SlimInfer and LLMC+ are accepted to AAAI.
- 2025.06: 🎉🎉 Our Temporal Feature Matters is accepted to TPAMI.
- 2025.05: 🎉🎉 Our HarmoniCa is accepted to ICML.
- 2024.10: 🎉🎉 Our LLMC is accepted to EMNLP Industry Track.
- 2024.07: 🎉🎉 Our PTSBench is accepted to ACM MM.
- 2024.02: 🎉🎉 Our TFMQ-DM is accepted to CVPR as a Highlight Poster.
Publications
* equal contribution · 📧 corresponding author · full list sorted by date.

LinVideo: A Post-Training Framework towards $\mathcal{O}(n)$ Attention in Efficient Video Generation
Yushi Huang, Xingtong Ge, Ruihao Gong📧, Chengtao Lv, Jun Zhang📧

MoDES: Accelerating Mixture-of-Experts Multimodal Large Language Models via Dynamic Expert Skipping
Yushi Huang, Zining Wang, Zhihang Yuan📧, Yifu Ding, Ruihao Gong, Jinyang Guo, Xianglong Liu, Jun Zhang📧

QVGen: Pushing the Limit of Quantized Video Generative Models
Yushi Huang, Ruihao Gong📧, Jing Liu, Yifu Ding, Chengtao Lv, Haotong Qin, Jun Zhang📧

Reinforcing Few-step Generators via Reward-Tilted Distribution Matching
Yushi Huang*, Xiangxin Zhou*, Ruoyu Wang*, Chi Zhang, Jun Zhang, Tianyu Pang📧

Temporal Feature Matters: A Framework for Diffusion Model Quantization
Yushi Huang, Ruihao Gong, Xianglong Liu📧, Jing Liu, Yuhang Li, Jiwen Lu, Dacheng Tao

Yushi Huang*, Zining Wang*, Ruihao Gong📧, Jing Liu, Xinjie Zhang, Jinyang Guo, Xianglong Liu, Jun Zhang📧

TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models
Yushi Huang*, Ruihao Gong*, Jing Liu, Tianlong Chen, Xianglong Liu📧

LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit
Ruihao Gong*, Yang Yong*, Shiqiao Gu*, Yushi Huang*, Chengtao Lv, Yunchen Zhang, Dacheng Tao, Xianglong Liu📧
Projects

LightCompress is an off-the-shelf compression suite for AIGC models (LLMs, VLMs, diffusion, etc.) that packages SOTA quantization, sparsification, and deployment best practices to shrink models while preserving accuracy. 700+ GitHub Stars.
Honors & Services
- Conference Reviewer: NeurIPS, ICLR, ICML, COLM, AAAI, CVPR, ECCV.
Education
Experience