MACHINE_LEARN Telegram 3539
InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity

20 Mar 2025 · Liming Jiang, Qing Yan, Yumin Jia, Zichuan Liu, Hao Kang, Xin Lu ·

Achieving flexible and high-fidelity identity-preserved image generation remains formidable, particularly with advanced Diffusion Transformers (DiTs) like FLUX. We introduce InfiniteYou (InfU), one of the earliest robust frameworks leveraging DiTs for this task. InfU addresses significant issues of existing methods, such as insufficient identity similarity, poor text-image alignment, and low generation quality and aesthetics. Central to InfU is InfuseNet, a component that injects identity features into the DiT base model via residual connections, enhancing identity similarity while maintaining generation capabilities. A multi-stage training strategy, including pretraining and supervised fine-tuning (SFT) with synthetic single-person-multiple-sample (SPMS) data, further improves text-image alignment, ameliorates image quality, and alleviates face copy-pasting. Extensive experiments demonstrate that InfU achieves state-of-the-art performance, surpassing existing baselines. In addition, the plug-and-play design of InfU ensures compatibility with various existing methods, offering a valuable contribution to the broader community.

Paper: https://arxiv.org/pdf/2503.16418v1.pdf

Code: https://github.com/bytedance/infiniteyou

Dataset: 10,000 People - Human Pose Recognition Data

@Machine_learn



tgoop.com/Machine_learn/3539
Create:
Last Update:

InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity

20 Mar 2025 · Liming Jiang, Qing Yan, Yumin Jia, Zichuan Liu, Hao Kang, Xin Lu ·

Achieving flexible and high-fidelity identity-preserved image generation remains formidable, particularly with advanced Diffusion Transformers (DiTs) like FLUX. We introduce InfiniteYou (InfU), one of the earliest robust frameworks leveraging DiTs for this task. InfU addresses significant issues of existing methods, such as insufficient identity similarity, poor text-image alignment, and low generation quality and aesthetics. Central to InfU is InfuseNet, a component that injects identity features into the DiT base model via residual connections, enhancing identity similarity while maintaining generation capabilities. A multi-stage training strategy, including pretraining and supervised fine-tuning (SFT) with synthetic single-person-multiple-sample (SPMS) data, further improves text-image alignment, ameliorates image quality, and alleviates face copy-pasting. Extensive experiments demonstrate that InfU achieves state-of-the-art performance, surpassing existing baselines. In addition, the plug-and-play design of InfU ensures compatibility with various existing methods, offering a valuable contribution to the broader community.

Paper: https://arxiv.org/pdf/2503.16418v1.pdf

Code: https://github.com/bytedance/infiniteyou

Dataset: 10,000 People - Human Pose Recognition Data

@Machine_learn

BY Machine learning books and papers




Share with your friend now:
tgoop.com/Machine_learn/3539

View MORE
Open in Telegram


Telegram News

Date: |

A vandalised bank during the 2019 protest. File photo: May James/HKFP. The best encrypted messaging apps A Telegram channel is used for various purposes, from sharing helpful content to implementing a business strategy. In addition, you can use your channel to build and improve your company image, boost your sales, make profits, enhance customer loyalty, and more. Polls A few years ago, you had to use a special bot to run a poll on Telegram. Now you can easily do that yourself in two clicks. Hit the Menu icon and select “Create Poll.” Write your question and add up to 10 options. Running polls is a powerful strategy for getting feedback from your audience. If you’re considering the possibility of modifying your channel in any way, be sure to ask your subscribers’ opinions first.
from us


Telegram Machine learning books and papers
FROM American