My research interests include large-scale video generation, audio-visual foundation models, and unified multi-modal models.