How does the model work? Why are freckles and moles lost? #135
-
The extension internally leverages InsightFace. The model uses a face embedding to represent the target face; this embedding is built from points on the face (see image), so it does not account for blemishes or wrinkles. The embedding is then fed to the model alongside the original image, and the model computes how to transform the original image to match the embedding.
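For concreteness, here is a minimal sketch of that pipeline using InsightFace's Python API, the way swap extensions typically drive it. The `buffalo_l` model pack and the `inswapper_128.onnx` file name are assumptions based on common setups, not details taken from this thread:

```python
import cv2
import insightface
from insightface.app import FaceAnalysis

# Detect faces and compute their identity embeddings.
app = FaceAnalysis(name="buffalo_l")  # assumed model pack
app.prepare(ctx_id=0, det_size=(640, 640))

source_img = cv2.imread("source.jpg")  # face to copy
target_img = cv2.imread("target.jpg")  # image to modify
source_face = app.get(source_img)[0]
target_face = app.get(target_img)[0]

# The identity is a single 512-d vector -- no room for freckles or moles.
print(source_face.normed_embedding.shape)  # (512,)

# The swapper redraws the target face to match that embedding.
swapper = insightface.model_zoo.get_model("inswapper_128.onnx")  # assumed file
result = swapper.get(target_img, target_face, source_face, paste_back=True)
cv2.imwrite("swapped.jpg", result)
```

Note the bottleneck: whatever the swapper draws, its only knowledge of the source identity is that single 512-dimensional vector.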
From this, we can deduce two things:
1. Freckles and moles are lost because the embedding simply never encodes them.
2. The fidelity of the result is capped by the embedding, not by the resolution of the input images.
In other words, one might attempt to retrain the model on larger images, but as long as it doesn't take higher-resolution face embeddings as input, the outcome will not be more accurate, just higher in resolution. And the same result can be obtained by upscaling, which is what this extension does.

The question we should ask is: why does InsightFace operate this way? One reason, I believe, is that they already have efficient models for computing embeddings (RetinaFace, ...). These models were originally designed for facial recognition, not for face swapping, which is why they do not take every facial feature into account. In exchange, embeddings are fast to compute and don't depend on a large number of reference photos; often a single reference photo is enough.

The model they've made public is, in essence, more of a toy than anything else. It's a toy that performs well and can be enhanced with upscaling, but it's still a toy that doesn't produce truly credible results. I say that without taking any merit away from the work done by the InsightFace team, which is really super cool.

I might be mistaken, but I believe the model's architecture is a GAN (Generative Adversarial Network), with a generator and a discriminator. The public model is the generator (without the training code). Conceptually, this is similar to StyleGAN. Retraining such a model would be costly: one would need to determine the generator's architecture (relatively straightforward), the discriminator's (more tedious, since it isn't provided), and the training technique, which isn't supplied either. The biggest challenge, however, would be to create a face embedding model that genuinely represents the target face. That is no small feat. (A conceptual sketch of such a generator follows at the end of this comment.)

I don't know whether the models InsightFace developed for Midjourney are better in this respect, or just better at upscaling the result; I haven't tested their bot on this point. It's quite possible that other teams are working on a better representation of faces specifically for the face-swapping task. Given the very touchy nature of the field, though, I doubt such a model will be widely released before face swapping becomes commonplace (for video conferencing/compression, for example).

Another line of research would probably be to guide face generation in the manner of ControlNet, by influencing the Stable Diffusion generation process.
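To make the GAN description above concrete, here is a purely conceptual PyTorch sketch of a generator conditioned on a 512-d identity embedding. This is not the actual inswapper architecture (which is not public); the layer sizes, the 128x128 crop size, and the channel-concatenation conditioning are all assumptions for illustration:

```python
import torch
import torch.nn as nn

class EmbeddingConditionedGenerator(nn.Module):
    """Toy generator: takes a face crop plus a 512-d identity embedding
    and produces a swapped face crop. Conceptual only -- the real
    inswapper internals are not published."""

    def __init__(self, emb_dim: int = 512):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
        )
        # Inject the identity embedding as extra feature channels.
        self.embed = nn.Linear(emb_dim, 128)
        self.decode = nn.Sequential(
            nn.ConvTranspose2d(256, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, img: torch.Tensor, emb: torch.Tensor) -> torch.Tensor:
        feats = self.encode(img)                  # (B, 128, H/4, W/4)
        cond = self.embed(emb)[:, :, None, None]  # (B, 128, 1, 1)
        cond = cond.expand(-1, -1, feats.shape[2], feats.shape[3])
        return self.decode(torch.cat([feats, cond], dim=1))

# A 128x128 crop, as the public model's name suggests.
gen = EmbeddingConditionedGenerator()
out = gen(torch.randn(1, 3, 128, 128), torch.randn(1, 512))
print(out.shape)  # torch.Size([1, 3, 128, 128])
```

During adversarial training, a discriminator would judge realism while an identity loss compares the embedding of the output to that of the source. Since no skin-detail signal flows through the 512-d bottleneck, freckles and moles cannot survive, regardless of how the generator is trained.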
-
Thanks for the explanation. There are so many settings that I thought I was doing something wrong.
-
Thanks for the explanation. Quite detailed and clear, it's super!
-
Hi there,
Freckles and moles are an important part of a face. Unfortunately, this information is not currently considered: the faces come out smooth, with almost no skin imperfections. That can be jarring if arms or other body parts still display freckles.
Is it possible to keep such facial features when swapping faces? Maybe also scars and pimples?
By the way, kudos for your work! Best extension ever 👍
Mark