About Me
I am currently a fourth-year Ph.D. student in the Multimedia Computing Group (MCG) in the Department of Computer Science and Technology, Nanjing University, supervised by Prof. Limin Wang. Before that, I received my bachelor's degree from Nanjing University.
My current research interests lie in multimodal understanding and generation, including image/video grounding and synthesis.
🔥 News
- 2024.06: We released the code of Open-MAGVIT2, an open-source implementation of Google’s MAGVIT-v2.
- 2024.02: 🎉🎉 One first-authored paper accepted by CVIU.
- 2024.02: 🎉🎉 One first-authored paper accepted by CVPR 2024.
- 2023.10: 🎉🎉 One first-authored paper accepted by T-PAMI.
📝 Selected Publications
Check out the full publication list on my Google Scholar profile.
BIVDiff: A Training-Free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
Fengyuan Shi, Jiaxi Gu, Hang Xu, Songcen Xu, Wei Zhang, Limin Wang
[ArXiv][Project Page][Code]
📝 Projects
OPEN-MAGVIT2: An Open-source Project Toward Democratizing Auto-Regressive Visual Generation
Zhuoyan Luo*, Fengyuan Shi*, Yixiao Ge, Yujiu Yang, Limin Wang, Ying Shan
[ArXiv][Code]
- An open-source replication of Google’s MAGVIT-v2 tokenizer with a super-large codebook (i.e., 2^18 codes) that achieves state-of-the-art reconstruction performance (1.17 rFID) on ImageNet 256 × 256.
- A family of auto-regressive image generation models, ranging from 300M to 1.5B parameters, that adapt the super-large codebook to vanilla auto-regressive generation.
🎖 Honors and Awards
- President’s Special Scholarship for Doctoral Students of Nanjing University
- YeePay Scholarship
- The People’s Scholarship in China
📖 Education
- 2021.09 - Now, Ph.D. in Computer Science and Technology, Nanjing University.
- 2017.09 - 2021.06, B.Sc. in Computer Science and Technology, Nanjing University.
🪪 Academic Service
- Conference Reviewer: CVPR 2024.
- Journal Reviewer: IEEE Transactions on Circuits and Systems; IET Computer Vision; IEEE Transactions on Circuits and Systems for Video Technology.
💻 Internships
- 2024.03 - Now, Research Intern, Tencent ARC Lab, Beijing, advised by Yixiao Ge.
- 2023.07 - 2024.02, Research Intern, Huawei Noah’s Ark Lab, Shanghai, advised by Hang Xu.
- 2023.04 - 2023.06, Research Intern, Ant Group, Hangzhou.