Top PDF visual question answering

Fusion of Detected Objects in Text for Visual Question Answering

... more visual object features we included in the model’s input, the better the model performed, even if they were not explicitly co-referent to the text, and that positional features of objects in the image were ...

10

Generating Question Relevant Captions to Aid Visual Question Answering

... Visual question answering (VQA) and image captioning require a shared body of general knowledge connecting language and vi- ...specific visual ques- ...ing question-relevant ...

10

Visual TTR: Modelling Visual Question Answering in Type Theory with Records

... Visual question answering is a recent popular task in the field of computer ...a visual and linguistic ...and question in TTR and, subsequently, evaluation of the utterance with respect ...

6

Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding

... or visual information with vector representations trained from large language or visual datasets has been successfully explored in recent ...as visual question answering require ...

12

Segmentation Guided Attention Networks for Visual Question Answering

... of Visual Question Answering by using a novel segmentation guided attention based network which we call SegAttend- ...the question answering accuracy from ...

6

Dynamic Capsule Attention for Visual Question Answering

... In visual question answering (VQA), recent advances have well advocated the use of attention mechanism to precisely link the question to the potential answer ...the question increases, ...

8

KVQA: Knowledge-Aware Visual Question Answering

... Question answering about image, also popularly known as Visual Question Answering (VQA), has gained huge interest in recent years (Goyal et ...scale Visual Question ...

9

Faithful Multimodal Explanation for Visual Question Answering

... AI systems’ ability to explain their reasoning is critical to their utility and trustworthiness. Deep neural networks have enabled significant progress on many challenging problems such as visual question ...

10

Improving Visual Question Answering by Referring to Generated Paragraph Captions

... as visual question ...with visual information present in the image because it can discuss both more abstract concepts and more explicit, intermediate symbolic information about objects, events, and ...

7

The Meaning of “Most” for Visual Question Answering Models

... a visual scene requires non-trivial inference ...for visual question answering learn when trained on such ques- ...same question was investigated for humans. Focusing on the FiLM ...

10

Adversarial Regularization for Visual Question Answering: Strengths, Shortcomings, and Side Effects

... Visual question answering (VQA) models have been shown to over-rely on linguistic biases in VQA datasets, answering questions “blindly” without considering visual ...on ...

13

Cross-Modal Multistep Fusion Network with Co-Attention for Visual Question Answering

... Visual question answering (VQA) is receiving increasing attention from researchers in both the computer vision and natural language processing ...and Question-guide Image Attention ...

9

ImageTTR: Grounding Type Theory with Records in Image Classification for Visual Question Answering

... We present ImageTTR, an extension to the Python implementation of Type Theory with Records (pyTTR) which connects formal record type representation with image classifiers implemented as deep neural networks. The Type ...

10

Data Augmentation for Visual Question Answering

... learning. Visual question answering (VQA) is a problem that fuses computer vision and NLP to build upon these ...a question about the image, and it predicts the answer to the question ...

5

Psycholinguistics Meets Continual Learning: Measuring Catastrophic Forgetting in Visual Question Answering

... We study the issue of catastrophic forgetting in the context of neural multimodal approaches to Visual Question Answering (VQA). Motivated by evidence from psycholinguistics, we devise a set of ...

5

BLOCK: Bilinear Superdiagonal Fusion for Visual Question Answering and Visual Relationship Detection

... standpoint: MLB (Kim et al. 2017) and MUTAN (Ben-Younes et al. 2017) constrain the tensor of parameters using respectively the CP and Tucker decomposition. In MFB (Yu et al. 2017b), the tensor is viewed as a stack of ...

8

Multi-grained Attention with Object-level Grounding for Visual Question Answering

... sentence-image alignment reports promising results in general, it sometimes fails to locate small objects or understand a complicated scenario. For the example in Figure 1, the question is “What is the man ...

6

Stacking with Auxiliary Features for Visual Question Answering

... of question types. The questions were tokenized and a question type was formed by adding one token at a time, up to a maximum of five, to the current ...The question “What is the color of the vase?” ...

10

Analyzing the Behavior of Visual Question Answering Models

... the question (using Long Short-Term Memory (LSTM) recurrent neural network to obtain question ...and question features obtained from the two channels are combined and passed through a fully connected ...

6

The Promise of Premise: Harnessing Question Premises in Visual Question Answering

... a question that is irrelevant to an image, state-of-the-art VQA models will still answer purely based on learned language biases, resulting in nonsensical or even misleading ...a visual question ...

10
