Welcome to Qi-U Community for programmer and developer-Open, Learning and Share


0 votes
in Technique[技术] by (31.9m points)

deep learning - How can tasks that aren't Image-to-image translation work with Pix2pix?

zi2zi, a Chinese alphabet generating GAN uses pix2pix for generating images. I also have seen many other applications using pix2pix for tasks that aren't related to image-to image translation. I compared the code of zi2zi with regular pix2pix, and found some implementation that I couldn't understand.

  1. What is the target source and where is the random noise? Unlike image-to-image translation tasks where there exists an obvious target image, what is supposed to be the target source for character generation?

  2. Suppose the output of the encoder portion of the unet is the latent space, then how are we supposed to set the latent space to a certain value for evaluation, exploration of the latent space while the decoder is effected by skip-connections of the encoder network?

  3. I want to ask how pix2pix generalizes with these types of problems pix2pix isn't meant to be a powerful solution.

Welcome To Ask or Share your Answers For Others

Please log in or register to answer this question.

1 Answer

0 votes
by (31.9m points)

After digging in the code for a few hours I discovered how zi2zi utilizes the pix2pix methodology. If I am correct, the data is split into two parts: real_A and real_B. real_A is fed into the generator along with the class label embedding_ids and produces fake_b. The discriminator then aims at discriminating a fake_b and real_b with real_a as the target image.

Conclusively, this seemingly works like an autoencoder, but with the discriminator as an evaluation metric. In concept, there isn't much that is a difference between pix2pix and other GANs with encoders.

enter image description here

Welcome to Qi-U Community for programmer and developer-Open, Learning and Share