5.《DreamText: High Fidelity Scene Text Synthesis》
paper: https://arxiv.org/abs/2405.14701
code: https://github.com/CodeGoat24/DreamText
6.《TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation》
paper: https://arxiv.org/pdf/2412.03069
code: https://github.com/ByteFlow-AI/TokenFlow
Video Generation
1.《High-Fidelity Relightable Monocular Portrait Animation with Lighting-Controllable Video Diffusion Model》
paper: https://arxiv.org/abs/2502.19894
code: https://github.com/MingtaoGuo/Relightable-Portrait-Animation
2.《Identity-Preserving Text-to-Video Generation by Frequency Decomposition》
paper: https://arxiv.org/abs/2411.17440
code: https://github.com/PKU-YuanGroup/ConsisID
3.《WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model》
paper: https://arxiv.org/abs/2411.17459
code: https://github.com/PKU-YuanGroup/WF-VAE
1.《CTRL-D: Controllable Dynamic 3D Scene Editing with Personalized 2D Diffusion》
paper: https://arxiv.org/pdf/2412.01792
code: https://ihe-kaii.github.io/CTRL-D/
2.《Edit Away and My Face Will not Stay: Personal Biometric Defense against Malicious Generative Editing》
paper: https://arxiv.org/abs/2411.16832
code: https://github.com/taco-group/FaceLock
3.《h-Edit: Effective and Flexible Diffusion-Based Editing via Doob’s h-Transform》
paper: https://arxiv.org/abs/2503.02187
code: https://github.com/nktoan/h-edit
4.《EmoEdit: Evoking Emotions through Image Manipulation》
paper: https://arxiv.org/pdf/2405.12661
code: https://github.com/JingyuanYY/EmoEdit