In recent years, Generative Adversarial Networks (GANs) have become widely popular for computer vision tasks. They are increasingly used for image editing and enhancement applications such as dust and scratch removal [R1], image colorization [R2], and unwanted object and watermark removal [R3].

Scratch Removal. Image taken from [R1]

Image Colorization. Image taken from [R2]

Image Inpainting. Image taken from [R3]

GANs for Fashion Image Editing

GANs have also found their way into numerous fashion image editing tasks. For instance, we could change the hair color of a model to increase diversity, or change their facial expression to a more cheerful one. Using techniques such as latent space exploration [R5], we can edit “fake” images generated by pre-trained GANs. However, due to the limited expressiveness of the latent space of a pre-trained generator, it is difficult to perfectly project “real” images into the latent space, which limits the applicability of this approach to “fake” image editing.
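The core idea of latent space editing can be sketched in a few lines: find a semantic direction in the generator’s latent space and shift a latent code along it before decoding. The snippet below is a minimal NumPy illustration with placeholder vectors; `edit_latent`, the 512-dimensional latent size, and the random stand-in for a learned “hair color” direction are assumptions for illustration, not the API of any specific GAN.

```python
import numpy as np

def edit_latent(w, direction, strength):
    """Shift a latent code w along a unit-normalized semantic direction."""
    direction = direction / np.linalg.norm(direction)
    return w + strength * direction

rng = np.random.default_rng(0)
w = rng.standard_normal(512)          # latent code from a pre-trained GAN (assumed size)
hair_dir = rng.standard_normal(512)   # placeholder for a learned attribute direction
w_edit = edit_latent(w, hair_dir, strength=3.0)
# decoding w_edit with the generator would yield the edited "fake" image
```

In practice the semantic direction is discovered, for example, by fitting a linear classifier on latent codes labeled by the attribute and taking the normal of its decision boundary.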

Figure 1. Fake Image Editing Using Latent Space Exploration

Another class of GANs, namely image-to-image translation networks [R4], has demonstrated high-quality results for real image editing. Training is carried out on paired image datasets in which the source and target domains are expected to be semantically aligned. The network is trained with a combination of adversarial, pixel-reconstruction, and perceptual losses.
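The combined objective can be sketched as follows. This is a hedged NumPy illustration of the three loss terms with assumed weights; real implementations compute the perceptual term on features from a pre-trained network (e.g. VGG) and train with automatic differentiation.

```python
import numpy as np

def pixel_loss(fake, real):
    # L1 reconstruction loss between generated and target images
    return np.mean(np.abs(fake - real))

def perceptual_loss(feat_fake, feat_real):
    # L2 distance between deep features (assumed precomputed here)
    return np.mean((feat_fake - feat_real) ** 2)

def adversarial_loss(d_fake):
    # non-saturating generator loss on discriminator scores in (0, 1)
    return -np.mean(np.log(d_fake + 1e-8))

def generator_loss(fake, real, feat_fake, feat_real, d_fake,
                   w_adv=1.0, w_pix=10.0, w_perc=10.0):
    # weights are illustrative; papers tune them per task
    return (w_adv * adversarial_loss(d_fake)
            + w_pix * pixel_loss(fake, real)
            + w_perc * perceptual_loss(feat_fake, feat_real))
```

The reconstruction and perceptual terms rely on the pairs being aligned, which is exactly why the training data must come in matched source/target pairs.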

However, it is difficult to obtain aligned datasets for many practical use cases. For instance, it is physically impossible to change a person’s hair color and capture images in the exact same pose and under the same lighting conditions, so a real paired dataset for hair color editing cannot be collected. To overcome this limitation, we train the network on “fake” image pairs generated through latent space exploration of a pre-trained GAN.
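The pair-generation step can be sketched like this: sample a latent code, decode it once as the source image, then decode the edited code as the perfectly aligned target. The NumPy snippet below uses a toy stand-in for the generator; `generator`, `make_fake_pair`, and all shapes are hypothetical placeholders, not a real GAN.

```python
import numpy as np

rng = np.random.default_rng(0)

def generator(w):
    # toy stand-in for a pre-trained GAN generator: latent code -> "image"
    return np.tanh(np.outer(w[:8], w[8:16]))

def make_fake_pair(w, direction, strength=3.0):
    """Synthesize an aligned (source, target) training pair from one latent code."""
    direction = direction / np.linalg.norm(direction)
    src = generator(w)                        # unedited image
    tgt = generator(w + strength * direction) # same "scene", edited attribute
    return src, tgt

pairs = [make_fake_pair(rng.standard_normal(16), rng.standard_normal(16))
         for _ in range(4)]
```

Because both images of each pair come from the same latent code, pose and lighting stay identical by construction, giving the alignment that real photographs cannot provide.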

Figure 2. Image to Image Translation Network trained using Fake Image pairs

Results for “real” image hair color editing

Results for “real” image expression editing

Apart from editing faces, GANs have numerous other image editing applications in the fashion domain. Recently, Kınlı et al. [R6] demonstrated encouraging results for fashion clothing inpainting. Inpainting networks are generally modeled in an adversarial training setup with a few modifications to the generator. Typically, the convolution operation is redesigned to account for the missing information in the holes of the input image, and attention layers are incorporated to capture the relative importance of the visible features in the input image.
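One such redesigned convolution is the gated convolution of [R3]: alongside the feature branch, the layer learns a soft gate that down-weights activations coming from hole regions. The NumPy sketch below uses a 1×1 convolution (pure channel mixing) for brevity; the function name and weight shapes are illustrative assumptions, not the implementation from [R3] or [R6].

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_conv2d(x, w_feat, w_gate):
    """Gated 1x1 convolution sketch.

    x: input feature map of shape (in_channels, H, W)
    w_feat, w_gate: weights of shape (out_channels, in_channels)
    """
    feat = np.einsum('chw,oc->ohw', x, w_feat)           # feature branch
    gate = sigmoid(np.einsum('chw,oc->ohw', x, w_gate))  # soft mask in (0, 1)
    return feat * gate  # gate suppresses features from invalid (hole) pixels
```

Because the gate is learned per pixel and per channel, the network can decide for itself which locations carry valid information, rather than relying on a fixed binary hole mask.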

Fashion Clothing Inpainting. Image taken from [R6]

References

[R1] Ionuţ Mironică. “A Generative Adversarial Approach with Residual Learning for Dust and Scratches Artifacts Removal.”

[R2] Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, Alexei A. Efros. “Image-to-Image Translation with Conditional Adversarial Networks.” In CVPR, 2017.

[R3] Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, Thomas Huang. “Free-Form Image Inpainting with Gated Convolution.” In ICCV, 2019.

[R4] Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Andrew Tao, Jan Kautz, Bryan Catanzaro. “High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs.” In CVPR, 2018.

[R5] Rameen Abdal, Yipeng Qin, Peter Wonka. “Image2StyleGAN++: How to Edit the Embedded Images?” In CVPR, 2020.

[R6] Furkan Kınlı, Barış Özcan, Furkan Kıraç. “A Benchmark for Inpainting of Clothing Images with Irregular Holes.” In Advanced Image Manipulation Workshop and Challenges, ECCV, 2020.