About
👋 Welcome to my homepage!
I'm an incoming Stanford MSCS student. Currently, I'm a fourth-year undergraduate studying Mathematics and Computer Science at the University of Illinois Urbana-Champaign, expected to graduate in May 2026.
I am broadly interested in world modeling—how to build models that can understand and simulate the dynamic 3D world from visual data. My current work explores this direction from three complementary perspectives:
- Controllable image and video generation, focusing on how to precisely guide the generation process while maintaining high visual quality and structural fidelity.
- Learning structured 4D representations from images, videos, and limited 3D data, aiming to discover a unified representation that captures both static geometry and temporal dynamics.
- Leveraging these representations to model real-world processes, moving from visual synthesis toward more interpretable and physically grounded generation.
I have also worked on several related directions, including adversarial attacks and image immunization for image-to-video generation (advised by Prof. James M. Rehg), KG-guided LLM reasoning (ULab@UIUC, advised by Prof. Jiaxuan You), and autoregressive image generation and multimodal learning at the Vision and Multimodal Research Center of BAAI (Beijing Academy of Artificial Intelligence).
I’m always open to collaboration and excited to explore new research directions in AI. If you are working on related problems or have ideas for potential projects, feel free to reach out—I'd be glad to contribute!
News
📄 My first paper ✂️✂️ Follow-Your-Shape (EditAnyShape) was accepted to ICLR 2026!
