Salient region detection in the task of visual question answering

Margarita Favorskaya; Vladimir Andreev; Aleksei Popov

doi:10.1088/1757-899X/450/5/052017

IOP Conference Series: Materials Science and Engineering

Paper • The following article is Open access

Margarita Favorskaya¹, Vladimir Andreev¹ and Aleksei Popov¹

Published under licence by IOP Publishing Ltd
IOP Conference Series: Materials Science and Engineering, Volume 450, Issue 5
Citation: Margarita Favorskaya et al 2018 IOP Conf. Ser.: Mater. Sci. Eng. 450 052017
DOI: 10.1088/1757-899X/450/5/052017

Author e-mails

favorskaya@sibsau.ru

Author affiliations

¹ Reshetnev Siberian State University of Science and Technology, 31 Krasnoyarsky Rabochy ave., Krasnoyarsk, 660037 Russian Federation

Abstract

Salient region detection in Visual Question Answering (VQA) is an attempt to simulate the human ability to perceive a scene quickly by selectively looking at image fragments instead of processing the whole scene. The conventional approach relies on neural networks. However, Convolutional Neural Networks (CNNs) have many disadvantages compared with traditional methods for salient region detection. We modified the basic salient region detection algorithm for the VQA task by selecting those image fragments that have a high probability of being included in a questionnaire. The experiments were conducted on images from the MS-COCO dataset and provided good segmentation results.

Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
