Scene Construction from Depth Map Using Image-to-Image Translation Model

Avşar, Hasan; Sarıgül, Mehmet; Karacan, Levent

doi:10.54856/jiswa.202205192

AKILLI SİSTEMLER VE UYGULAMALARI DERGİSİ
JOURNAL OF INTELLIGENT SYSTEMS WITH APPLICATIONS
J. Intell. Syst. Appl.

E-ISSN: 2667-6893

This work is licensed under a Creative Commons Attribution 4.0 International License.

Görüntüden Görüntüye Dönüşüm Modeli Kullanılarak Derinlik Haritası Girdisiyle Sahne Görseli Üretimi

How to cite: Avşar H, Sarıgül M, Karacan L. Scene construction from depth map using image-to-image translation model. Akıllı Sistemler ve Uygulamaları Dergisi (Journal of Intelligent Systems with Applications) 2022; 5(1): 8-11.

Full Text: PDF, in English.

Title: Scene Construction from Depth Map Using Image-to-Image Translation Model

Abstract: In recent years, deep learning approaches to image and video processing problems have become very popular. Generative Adversarial Networks (GANs) are among the most popular deep learning-based models. A GAN forms a generative model from two sub-models, a generator and a discriminator. The generator tries to produce outputs that are indistinguishable from real data, while the discriminator tries to classify the generator's outputs as real or fake. Trained together, the two models achieve the generation of realistic outputs. This study aims to reconstruct the daytime image corresponding to depth map data recorded with a camera or sensor that can capture depth at night or in a lightless environment. Our model was used to reconstruct 2D images from depth map representations of a known scene. The model was trained on the chess scene from the 7-scenes dataset, and realistic 2D images were successfully generated for the given input depth maps.
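The generator and discriminator roles described in the abstract can be illustrated with a minimal sketch of the standard binary cross-entropy GAN objective. This is an assumption made for illustration only: the abstract does not state which loss the authors' Pix2pixHD-based model uses, and the function names and the example discriminator scores below are hypothetical.

```python
import math


def bce(p, target, eps=1e-12):
    """Binary cross-entropy of one probability p against a 0/1 target."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def discriminator_loss(d_real, d_fake):
    """Discriminator wants real samples scored 1 and generated samples scored 0."""
    real_term = sum(bce(p, 1.0) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0.0) for p in d_fake) / len(d_fake)
    return 0.5 * (real_term + fake_term)


def generator_loss(d_fake):
    """Generator wants its outputs scored 1 (non-saturating form)."""
    return sum(bce(p, 1.0) for p in d_fake) / len(d_fake)


# Hypothetical discriminator scores in (0, 1): probability that an
# (input depth map, image) pair is real.
d_real = [0.9, 0.8]  # scores on real scene images
d_fake = [0.2, 0.1]  # scores on generator outputs

print("discriminator loss:", discriminator_loss(d_real, d_fake))
print("generator loss:", generator_loss(d_fake))
```

In this sketch a confident, correct discriminator gives a low discriminator loss and a high generator loss. As the generator's outputs become harder to tell apart from real images, the scores in `d_fake` rise and the generator loss falls.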

Keywords: deep learning; generative adversarial networks; Pix2pixHD

Title (Turkish): Görüntüden Görüntüye Dönüşüm Modeli Kullanılarak Derinlik Haritası Girdisiyle Sahne Görseli Üretimi (Scene Image Generation from Depth Map Input Using an Image-to-Image Translation Model)

Abstract (translated from Turkish): In recent years, deep learning-based approaches have gained great popularity for solving image and video processing problems. Generative Adversarial Networks (GANs) are among the most widely preferred deep learning models. GANs are deep generative models formed by combining two different sub-models, a generator and a discriminator. The generator sub-model aims to produce outputs as close to real as possible, while the discriminator sub-model aims to label the outputs produced by the generator as real or fake. The aim of this study is to generate color images of an environment from depth map data acquired with special cameras or sensors in a lightless environment. Our model was tested by generating color images from the depth map data of a known scene. Trained on the chess scene of the 7-scenes dataset, our model successfully generated color images from depth map inputs.

Keywords (translated from Turkish): deep learning; generative adversarial networks; Pix2pixHD
