Shristi Das Biswas, Matthew Shreve, Xuelu Li, Prateek Singhal, Kaushik Roy
Purdue University, Amazon.com
Recent advancements in language-guided diffusion models for image editing are often bottlenecked by cumbersome prompt engineering to precisely articulate desired changes. A more intuitive alternative calls on guidance from in-the-wild exemplars to help users draw inspiration and bring their imagined edits to life. Contemporary exemplar-based editing methods shy away from leveraging the rich latent space learnt by pre-existing large text-to-image (TTI) models and instead fall back on training with curated objective functions, which, though somewhat effective, demands significant computational resources and lacks compatibility with diverse base models and an arbitrary number of exemplars. On further investigation, we also find that these techniques limit user control over the degree of change to global adjustments applied uniformly across the entire edited region. In this paper, we introduce PIXELS, a novel framework for progressive exemplar-driven editing with off-the-shelf diffusion models, which enables customization through granular control over edits, allowing adjustments at the pixel or region level. Our method operates solely during inference to facilitate imitative editing, enabling users to draw inspiration from a dynamic number of reference images and progressively incorporate all the desired edits without retraining or fine-tuning the underlying generation models. This fine-grained control opens up a range of new possibilities, including selective modification of individual objects and specification of gradual spatial changes. We demonstrate that PIXELS delivers high-quality edits efficiently, outperforming existing methods in both exemplar fidelity and visual realism through quantitative comparisons and a user study. By making high-quality image editing more accessible, PIXELS has the potential to bring professional-grade edits to a wider audience with the ease of using any open-source generation model.
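The abstract above describes control over the degree of change at the pixel or region level. The snippet below is only an illustrative toy, not the PIXELS algorithm: it sketches the general idea of blending several exemplar-driven edits into a source image with user-supplied per-pixel strength maps. All function and variable names here are hypothetical.

# Illustrative toy only -- not the PIXELS method. Each exemplar-driven edit is
# blended into the source image according to its own per-pixel strength map,
# which is the kind of region-level control described above.
import torch

def blend_edits(source, edits, strength_maps):
    # source, edits[i]: tensors of shape [C, H, W]; strength_maps[i]: [H, W] in [0, 1]
    out = source.clone()
    for edit, strength in zip(edits, strength_maps):
        w = strength.clamp(0, 1).unsqueeze(0)  # broadcast the map over channels
        out = (1 - w) * out + w * edit         # higher weight -> closer to the edit
    return out

# Toy usage: apply one edit fully on the left half and another at 30% strength on the right.
C, H, W = 3, 64, 64
source = torch.rand(C, H, W)
edits = [torch.rand(C, H, W), torch.rand(C, H, W)]
left = torch.zeros(H, W); left[:, : W // 2] = 1.0
right = torch.zeros(H, W); right[:, W // 2 :] = 0.3
blended = blend_edits(source, edits, [left, right])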
- Python (version 3.9)
- NVIDIA GPU with CUDA support
- Conda or virtualenv (optional but recommended)
- Create a virtual environment (optional but recommended):
  conda create --name pixels python=3.9
- Activate the virtual environment:
  conda activate pixels
- Install the required dependencies:
  pip install -r requirements.txt
- Ensure that your virtual environment is activated.
- Make sure that your GPU is properly set up and accessible (a quick check is sketched below).
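A minimal way to confirm the GPU is visible, assuming PyTorch is among the installed dependencies:

# Confirm that PyTorch can see a CUDA device before launching inference.
import torch

if torch.cuda.is_available():
    print("CUDA device found:", torch.cuda.get_device_name(0))
else:
    print("No CUDA device found; the inference scripts expect a GPU.")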
- For Stable Diffusion XL:
  - Run the script:
    python SDXL/inference.py
- For Stable Diffusion 2.1:
  - Run the script:
    python SD2/inference.py
- For Kandinsky 2.2:
  - Run the script:
    python Kandinsky/inference.py
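Optionally, the base checkpoints can be pre-fetched so the first run does not stall on downloads. The sketch below uses Hugging Face diffusers with the standard public model IDs; the repository's inference scripts may load different weights or revisions, so treat this as an assumption-laden convenience rather than part of the pipeline.

# Hedged sketch: pre-download the three base checkpoints with diffusers.
# The model IDs are the standard public releases and may differ from what
# the inference scripts actually load.
import torch
from diffusers import AutoPipelineForText2Image

MODEL_IDS = {
    "Stable Diffusion XL": "stabilityai/stable-diffusion-xl-base-1.0",
    "Stable Diffusion 2.1": "stabilityai/stable-diffusion-2-1",
    "Kandinsky 2.2": "kandinsky-community/kandinsky-2-2-decoder",
}

for name, model_id in MODEL_IDS.items():
    pipe = AutoPipelineForText2Image.from_pretrained(model_id, torch_dtype=torch.float16)
    print(f"{name}: cached {model_id}")
    del pipe  # weights remain in the local Hugging Face cache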
to be filled
See CONTRIBUTING for more information.
This project is licensed under the Apache-2.0 License.