Mathematics of Artificial Intelligence
November 25, 2025 17:00, Moscow, Skolkovo Institute of Science and Technology, Bolshoy Boulevard, 30, p.1
Computational optimal transport and generative
modeling for image-to-image translation
D. Selikhanovich
Abstract:
The thesis treats image-to-image (I2I) translation as a machine learning problem: given sets of images from two domains, the goal is to construct a mapping from the first domain to the second
domain that generalizes to new data. This problem has many
practical applications in imaging, including image editing, image enhancement,
and image synthesis. Modern solutions that achieve high realism use generative models, of which the best known are generative adversarial networks (GANs). However, GANs have several significant disadvantages in
solving the problem in an unpaired setting. The thesis proposes a methodology
to evaluate the realism-similarity trade-off for GANs, which shows that a good
trade-off for these models requires an exhaustive search over loss hyperparameters
and large computational resources. To address these problems, the thesis develops
a new algorithm based on the computation of optimal transport (OT) mappings
for Monge’s problem using deep neural networks. The numerical evaluation
shows that, compared with GANs, the proposed algorithm achieves a better realism-similarity trade-off
without requiring loss hyperparameters for style translation and
object synthesis problems. Motivated by the diversity property, which improves
realism for GANs in multimodal I2I translation problems, the thesis proposes a novel kernel-variance-based regularization
of the derived algorithm, which encourages the OT mappings to be stochastic. Numerical experiments show that the proposed regularization leads to better realism compared
to one-to-many GANs. Finally, the thesis considers another approach to diversity in I2I translation maps: entropic regularization of the OT problem and
the Schrödinger Bridge (SB) problem. Existing SB-based models for unpaired
I2I translation problems have limited practical applications due to the need for
simulation with dozens or hundreds of iterative diffusion steps. Recently, a theoretical Discrete Iterative Markovian Fitting (D-IMF) procedure was proposed for
learning the SB model in discrete time between arbitrary data domains, which
in theory decreases the number of iterative steps needed for the simulation. The thesis proposes an effective implementation of the theoretical D-IMF procedure
that solves the SB problem using adversarial learning, applied to unpaired
I2I translation problems. The numerical results show that the proposed implementation improves the realism of the translated images compared to the IMF
algorithm in continuous time, while using only four iterative steps for the generation
process instead of hundreds.
Website:
https://vc.skoltech.ru/b/ele-pyk-eib-06r