research

Currently, I'm collaborating with Yining Hong and Andrew Lizarraga on applying vision-language models to collaborative embodied agents. Previously, I worked on datasets for action-driven video generation as well as memory-augmented video generation.

publications

EMBODIED WEB AGENTS: Bridging Physical-Digital Realms for Integrated Agent Intelligence
Yining Hong*; Rui Sun*; Bingxuan Li†; Xingcheng Yao†; Maxine Wu†; Alexander Chien†; Da Yin; Ying Nian Wu; Zhecan James Wang; Kai-Wei Chang
NeurIPS 2025 Datasets & Benchmarks [Project Page] [Paper]

SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation
Yining Hong; Beide Liu*; Maxine Wu*; Yuanhao Zhai; Kai-Wei Chang; Linjie Li; Kevin Lin; Chung-Ching Lin; Jianfeng Wang; Zhengyuan Yang††; Yingnian Wu††; Lijuan Wang††
ICLR 2025 [Project Page] [Paper]