Logical Implications for Visual Question Answering Consistency

Tascon-Morales, Sergio; Márquez-Neila, Pablo; Sznitman, Raphael (2023). Logical Implications for Visual Question Answering Consistency. In: IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR). Vancouver. Jun 18-22, 2023.

2303.09427.pdf - Submitted Version
Available under License Creative Commons: Attribution (CC-BY).

Despite considerable recent progress in Visual Question Answering (VQA) models, inconsistent or contradictory answers continue to cast doubt on their true reasoning capabilities. However, most proposed methods use indirect strategies or strong assumptions on pairs of questions and answers to enforce model consistency. Instead, we propose a novel strategy intended to improve model performance by directly reducing logical inconsistencies. To do this, we introduce a new consistency loss term that can be used by a wide range of VQA models and which relies on knowing the logical relation between pairs of questions and answers. While such information is typically not available in VQA datasets, we propose to infer these logical relations using a dedicated language model and use these in our proposed consistency loss function. We conduct extensive experiments on the VQA Introspect and DME datasets and show that our method brings improvements to state-of-the-art VQA models while being robust across different architectures and settings.
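
The abstract does not give the form of the consistency loss, so the following is only an illustrative sketch of the general idea of penalizing violated implications between two question-answer pairs. It assumes each pair has been reduced to a probability that the model considers the corresponding proposition true, and it uses a generic soft-logic penalty (probability of the premise times one minus probability of the consequence). The function names, the penalty form, and the `weight` parameter are hypothetical and are not taken from the paper.

```python
from typing import Iterable, Tuple


def implication_penalty(p_premise: float, p_consequence: float) -> float:
    """Soft penalty for violating 'premise implies consequence'.

    Assumption: both inputs are probabilities in [0, 1] that the model
    assigns to the two propositions being true. The penalty is largest
    when the premise is believed and the consequence is rejected.
    This is a generic soft-logic form, not the paper's loss.
    """
    return p_premise * (1.0 - p_consequence)


def total_loss(
    task_loss: float,
    implied_pairs: Iterable[Tuple[float, float]],
    weight: float = 0.5,  # hypothetical trade-off hyperparameter
) -> float:
    """Add the mean implication penalty to an ordinary VQA task loss."""
    penalties = [implication_penalty(p, q) for p, q in implied_pairs]
    mean_penalty = sum(penalties) / len(penalties) if penalties else 0.0
    return task_loss + weight * mean_penalty
```

In this sketch, a pair for which the model is confident in the premise but rejects the consequence contributes a penalty close to 1, while pairs that respect the implication contribute close to 0.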

Item Type:

Conference or Workshop Item (Paper)

Division/Institute:

10 Strategic Research Centers > ARTORG Center for Biomedical Engineering Research
10 Strategic Research Centers > ARTORG Center for Biomedical Engineering Research > ARTORG Center - AI in Medical Imaging Laboratory

Graduate School:

Graduate School for Cellular and Biomedical Sciences (GCB)

UniBE Contributor:

Tascon Morales, Sergio; Márquez Neila, Pablo; Sznitman, Raphael

Subjects:

500 Science > 570 Life sciences; biology
600 Technology > 610 Medicine & health
000 Computer science, knowledge & systems
100 Philosophy > 160 Logic

Funders:

Swiss National Science Foundation

Language:

English

Submitter:

Sergio Tascon Morales

Date Deposited:

24 May 2023 11:44

Last Modified:

07 Jul 2023 09:54

ArXiv ID:

2303.09427

BORIS DOI:

10.48350/182869

URI:

https://boris.unibe.ch/id/eprint/182869
