This repo is for an AI service project that generates ID photos and profile photos. The service is built on the Stable Diffusion model together with Dreambooth. Stable Diffusion is a generative model that produces high-quality images; Dreambooth is a fine-tuning technique that adapts it to generate images of a specific subject or style.
Model files are not included in this repo.
I plan to serve this through an app or website to create a revenue structure. If you're interested in collaborating, or just want to get your personalized result, please don't hesitate to contact me at hwk06023@gmail.com.
Any feedback or advice is welcome.
- Fine-tuning the Realistic Vision model of Stable Diffusion with Dreambooth
- Effective service pipeline through a generic model
A constant value (the prior loss weight) is multiplied by the class loss, serving as regularization to prevent overfitting to the instance images. However, because the results became less similar to the instance (underfitting), I gradually reduced this value from the 1.0 suggested in the paper and tested the outcome.
By appropriately balancing these two loss terms, I obtained better results with respect to both underfitting and overfitting.
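The balance described above can be sketched as follows. This is a minimal illustration of a Dreambooth-style objective, not the actual training code; the function name and the MSE-on-noise-predictions framing are assumptions for the sake of the example.

```python
import numpy as np

def dreambooth_loss(instance_pred, instance_target,
                    class_pred, class_target,
                    prior_loss_weight=0.5):
    """Total loss = instance MSE + prior_loss_weight * class (prior) MSE.

    prior_loss_weight < 1.0 relaxes the prior-preservation regularizer,
    trading some overfitting risk for stronger instance fidelity.
    """
    instance_loss = np.mean((instance_pred - instance_target) ** 2)
    prior_loss = np.mean((class_pred - class_target) ** 2)
    return instance_loss + prior_loss_weight * prior_loss
```

Lowering `prior_loss_weight` from 1.0 shifts the optimization toward the instance images, which is the knob I tuned against underfitting.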
The prompts were generated in the format specified by the authors: "a [V] [class noun] [context description]." Furthermore, so that the [V] (instance noun) would bind better, the name portion was turned into a rarely used string by applying a Caesar cipher.
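The token-generation step can be sketched like this. The example name `"hannah"`, the shift of 3, and the class noun "person" are illustrative assumptions, not values from the actual project.

```python
def caesar_token(name: str, shift: int = 3) -> str:
    """Caesar-shift alphabetic characters to produce an uncommon token.

    e.g. "hannah" -> "kdqqdk", a string unlikely to collide with words
    the text encoder already has strong priors for.
    """
    shifted = []
    for ch in name.lower():
        if ch.isalpha():
            shifted.append(chr((ord(ch) - ord('a') + shift) % 26 + ord('a')))
        else:
            shifted.append(ch)
    return ''.join(shifted)

# Prompt in the paper's format: "a [V] [class noun] [context description]"
prompt = f"a {caesar_token('hannah')} person in a studio ID photo"
```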
Rather than updating the entire model's weights, LoRA achieves effective fine-tuning by updating only a small number of additional parameters through a low-rank decomposition. This yields significant savings in time and resources.
It was convenient to apply, since it is available together with Dreambooth on Hugging Face.
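The low-rank idea can be sketched in a few lines of NumPy. This is a toy forward pass, not the diffusers implementation; the dimensions and initialization below are illustrative assumptions.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=1.0):
    """Forward pass with a low-rank update: y = x @ (W + alpha * B @ A).T.

    W (d_out x d_in) stays frozen; only A (r x d_in) and B (d_out x r)
    are trained, cutting trainable parameters from d_out * d_in down
    to r * (d_in + d_out).
    """
    return x @ W.T + alpha * (x @ A.T) @ B.T

rng = np.random.default_rng(0)
d_in, d_out, r = 8, 6, 2
W = rng.normal(size=(d_out, d_in))
A = rng.normal(size=(r, d_in))
B = np.zeros((d_out, r))  # B starts at zero, so training begins at the base model
x = rng.normal(size=(3, d_in))
```

With `B` initialized to zero, the adapted model is exactly the base model at the start of training, which is part of why LoRA fine-tuning is stable.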
However, from a business standpoint, I thought it impractical to run fine-tuning for a personalized model every time the number of users grows. An effective service pipeline built around a generic model is therefore needed.
- Creating Reference Profile Photos with a Stable Diffusion-Based Model
- Getting the subject's pose with DWPose
- Using the image2text model to obtain information on the subject's attire, expression, and background to input into the prompt
- Feeding IP-Adapter and OpenPose data into ControlNet, which is based on the Realistic Vision model (generic model), to generate a temporary profile photo
- Completing the image by enhancing the detailed facial information with FaceSwapLab's INSwapper
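The steps above can be sketched as a single pipeline function. Every function name here is a placeholder standing in for the real component (DWPose, an image2text model, Realistic Vision + ControlNet + IP-Adapter, and INSwapper); none of these are real library APIs.

```python
# Hypothetical sketch of the generic-model pipeline; the stub return
# values only mark what flows between stages.

def extract_pose(photo):
    # stands in for DWPose keypoint extraction
    return {"keypoints": "..."}

def describe_subject(photo):
    # stands in for an image2text (captioning) model describing
    # attire, expression, and background for the prompt
    return "business suit, slight smile, gray studio background"

def generate_profile(pose, caption, identity_photo):
    # stands in for Realistic Vision + ControlNet (OpenPose) + IP-Adapter
    return {"image": "temp_profile", "prompt": caption, "pose": pose}

def refine_face(temp, identity_photo):
    # stands in for FaceSwapLab's INSwapper face refinement
    return {"image": "final_profile", "identity": identity_photo}

def pipeline(reference_photo, user_photo):
    pose = extract_pose(reference_photo)
    caption = describe_subject(reference_photo)
    temp = generate_profile(pose, caption, user_photo)
    return refine_face(temp, user_photo)
```

The key property is that no stage involves per-user training: every component runs inference against a generic model, which is what makes the roughly one-minute turnaround possible.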
Previously, fine-tuning for each user took about 30 minutes. This pipeline completes the same process in around one minute, which makes it significantly more practical.
However, there were drawbacks in terms of performance. The results appeared somewhat Westernized, leading me to conclude that the Realistic Vision model used was biased due to being fine-tuned with predominantly Western data. Consequently, I wanted to retest using a Realistic Vision model trained primarily on Asian data, but I could only find models trained mainly on female Asian data.
Realistically, if the data and resources were available, it would be more effective in the long run to build and serve our own generic Realistic Vision models for different races and genders.