Ruihan Yang

Senior Researcher · Tencent America

Ph.D. UC Irvine

I am a Senior Researcher at Tencent America. My current research centers on multimodal understanding, with a particular emphasis on vision-language models. I am also actively exploring computer use agents. On the side, I have been experimenting with text-to-speech generation.

I received my PhD in Computer Science from UC Irvine, where I conducted research starting in Fall 2019 under the supervision of Prof. Stephan Mandt. My doctoral research primarily focused on developing neural approaches for image/video compression and generation.

Before joining UC Irvine, I earned my bachelor's degree in Computer Science from NYU Shanghai. During my undergraduate studies, I had the privilege of working as a research assistant with Prof. Gus Xia, where I investigated generative modeling applications in music. Additionally, I collaborated with Prof. Hanghui Chen on physics research, gaining valuable interdisciplinary experience.

Vision-Language Models · Computer Use Agent · Multimodal Learning · Image/Video Compression · Image/Video Generation · Diffusion Generative Models

Latest Publication

Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders
Technical Report, 2026
Boqiang Zhang*, Lei Ke*, Ruihan Yang*, Qi Gao*, Tianyuan Qu*, Rossell Chen, Dong Yu, Leoweiliang