ComfyUITemplates.com
Discover free ready-made ComfyUI templates for AI workflows.
FLUX & ByteDance-USO: Single Img2Img
USO, built on FLUX.1-dev, is a ByteDance model that unifies style-driven and subject-driven image generation. It uses decoupled learning and style reward learning to achieve subject consistency, apply artistic styles, or combine both.
ComfyUI Workflow: FLUX & ByteDance-USO – Single Img2Img
This ComfyUI workflow integrates the USO (Unified Style-Subject Optimized) model developed by ByteDance. Built on the FLUX.1-dev architecture, USO unifies style-driven and subject-driven image generation in a single framework, achieving both high style similarity and consistent subject identity.
Key Capabilities
- Subject-driven generation: Places subjects into new scenes while consistently maintaining their identity.
- Style-driven generation: Applies artistic styles from reference images to new content.
- Combined mode: Uses a subject reference and a style reference together, either preserving the subject's original layout or shifting it to a new one.
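As a rough illustration of the three modes above, the choice comes down to which reference images are supplied. The sketch below is not part of the workflow or of any ComfyUI API; `UsoReferences` and `generation_mode` are hypothetical names used only to make the mapping explicit, and rejecting a run with no reference image is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UsoReferences:
    """Hypothetical container for the optional reference images of one run."""
    subject_image: Optional[str] = None  # path to a subject reference, if any
    style_image: Optional[str] = None    # path to a style reference, if any


def generation_mode(refs: UsoReferences) -> str:
    """Name the mode implied by which references are present."""
    has_subject = refs.subject_image is not None
    has_style = refs.style_image is not None
    if has_subject and has_style:
        return "combined"
    if has_subject:
        return "subject-driven"
    if has_style:
        return "style-driven"
    # Assumption: a run needs at least one reference image.
    raise ValueError("provide a subject image, a style image, or both")
```

For example, `generation_mode(UsoReferences(style_image="ink_wash.png"))` selects the style-driven case, where only the artistic style is taken from the reference.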
How It Works
USO tackles the challenge of unifying style and subject generation by disentangling and recombining “content” and “style” within a single training framework. The core design includes:
- Decoupled learning: Separate objectives for learning style characteristics and subject/content structure.
- Style Reward Learning (SRL): A reward-based paradigm that refines style similarity and overall visual quality.
- Disentangled learning scheme with two complementary objectives:
  - Style-alignment training to align and learn robust style features.
  - Content–style disentanglement training to keep content information distinct from stylistic elements for flexible recomposition.
- Large-scale triplet dataset of content images, style images, and their stylized counterparts used to supervise unified training.
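To make the triplet structure concrete, here is a minimal Python sketch of one dataset record and how it might be split into conditioning inputs and a training target. The class, field, and function names are hypothetical, and treating content and style as inputs with the stylized image as the target is an assumption about how the triplets supervise training, not a description of ByteDance's actual pipeline.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Triplet:
    """One record: a content image, a style image, and the stylized result."""
    content_image: str   # path to the content (subject) image
    style_image: str     # path to the style reference image
    stylized_image: str  # path to the content rendered in that style


def supervised_example(triplet: Triplet) -> Dict[str, object]:
    """Split a triplet into conditioning inputs and the target to reproduce.

    Assumption: content and style act as inputs, the stylized image as target.
    """
    return {
        "inputs": {
            "content": triplet.content_image,
            "style": triplet.style_image,
        },
        "target": triplet.stylized_image,
    }
```

Keeping content and style as separate input fields mirrors the disentanglement idea: each can be swapped independently when recombining them at generation time.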
Why Use This Workflow
- Unified solution: One workflow covers style transfer, subject-driven generation, and combined style+subject generation, tasks that are usually handled by separate models.
- Consistent results: Preserves subject identity while delivering strong style resemblance across varied prompts and layouts.
- Broad application: Ideal for character placement, branded illustrations, stylized portraits, and general artistic stylization from a small number of reference images.