Stable Diffusion may soon have some competition when it comes to open-source image generators. In its latest iteration, PixArt becomes faster and more accurate while maintaining a relatively high resolution.
In a paper, researchers from Huawei Noah’s Ark Lab, Dalian University of Technology, Tsinghua University, and Hugging Face presented PixArt-δ (Delta), an advanced text-to-image synthesis framework designed to compete with the Stable Diffusion family.
This model is a significant improvement over the previous PixArt-α (Alpha) model, which was already able to quickly generate images with a resolution of 1,024 x 1,024 pixels.
High-resolution image generation in half a second
PixArt-δ integrates the Latent Consistency Model (LCM) and ControlNet into the PixArt-α model, significantly accelerating inference. The model can generate high-quality images with a resolution of 1,024 x 1,024 pixels in just two to four sampling steps, taking as little as 0.5 seconds — about seven times faster than PixArt-α.
SDXL Turbo, introduced by Stability AI in November 2023, can generate images of 512 x 512 pixels in just one step, or about 0.2 seconds.
However, PixArt-δ's results are higher resolution and appear more consistent than those of SDXL Turbo and a four-step SDXL variant with LCM. The images show fewer errors, and the model follows the prompt more accurately.
Image: Chen et al.
The new PixArt model is designed to train efficiently on V100 GPUs with 32 GB of VRAM in less than a day. In addition, its 8-bit inference capability allows it to synthesize 1,024-pixel images even on GPUs with 8 GB of VRAM, greatly improving its usability and accessibility.
More control over image generation
The integration of a ControlNet module into PixArt-δ allows finer control of text-to-image diffusion models using reference images. The researchers introduce a novel ControlNet architecture specifically designed for transformer-based models that provides explicit controllability while maintaining high-quality image generation.
Image: Chen et al.
The researchers have published the weights for the ControlNet variant of PixArt-δ on Hugging Face. However, an online demo seems to be available only for PixArt-α, with and without LCM.