Object Counts! Bringing Explicit Detections Back into Image Captioning

Academic year: 2020


Figure

Figure 1: Using explicit detections as an intermediate step towards end-to-end image captioning
Table 1: CIDEr scores for image captioning using bag of objects variants as visual representations
Table 2: CIDEr scores for captioning comparing the use of min, max or average pooling of either object size or distance features, using ground truth annotations.
Table 3: k-Nearest Neighbour (k=5) trial on the ground truth bag of objects (Freq.) and the projected bag of objects (Proj.) representations. [...]sents a subset of 2301 samples where all the 5 neighbours have [...]
