Learning Visual Actions Using Multiple Verb-Only Labels

Wray, Michael; Damen, Dima

arXiv:1907.11117 (cs)

[Submitted on 25 Jul 2019 (v1), last revised 1 Aug 2019 (this version, v2)]

Abstract: This work introduces verb-only representations for both recognition and retrieval of visual actions in video. Current methods neglect legitimate semantic ambiguities between verbs, instead choosing unambiguous subsets of verbs along with objects to disambiguate the actions. We instead propose multiple verb-only labels, which we learn through hard or soft assignment as a regression. This enables learning a much larger vocabulary of verbs, including contextual overlaps of these verbs. We collect multi-verb annotations for three action video datasets and evaluate the verb-only labelling representations for action recognition and cross-modal retrieval (video-to-text and text-to-video). We demonstrate that multi-label verb-only representations outperform conventional single verb labels. We also explore other benefits of a multi-verb representation, including cross-dataset retrieval and verb type (manner and result verb types) retrieval.
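
The abstract does not spell out how hard and soft assignment are computed, so the following is only an illustrative sketch of one plausible reading, not the authors' implementation. It assumes each video has verb lists from several annotators, that a soft label is the fraction of annotators choosing a verb, that a hard label thresholds that fraction, and that regression uses a mean squared error. The function names, the toy vocabulary, the 0.5 threshold, and the choice of loss are all hypothetical.

```python
from collections import Counter

def soft_labels(annotations, vocabulary):
    """Assumed soft assignment: fraction of annotators who chose each verb."""
    # Count each verb at most once per annotator.
    counts = Counter(v for verbs in annotations for v in set(verbs))
    n = len(annotations)
    return [counts[v] / n for v in vocabulary]

def hard_labels(soft, threshold=0.5):
    """Assumed hard assignment: binarise soft scores (threshold is hypothetical)."""
    return [1.0 if s >= threshold else 0.0 for s in soft]

def mse(pred, target):
    """Mean squared error, used here as a stand-in regression loss."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

# Toy data (invented for illustration): four annotators label one video.
vocabulary = ["open", "pull", "turn", "push"]
annotations = [["open", "pull"], ["open"], ["open", "pull"], ["pull", "turn"]]

soft = soft_labels(annotations, vocabulary)
hard = hard_labels(soft)
print(soft)  # per-verb agreement scores in [0, 1]
print(hard)  # binary multi-verb target
```

Under these assumptions, either vector could serve as the regression target for a model that outputs one score per verb; the soft version keeps the contextual overlap between verbs such as "open" and "pull" that a single-verb label would discard.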

Comments:	Accepted at BMVC 2019. More information can be found at this https URL. Annotations can be found at this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1907.11117 [cs.CV]
	(or arXiv:1907.11117v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1907.11117

Submission history

From: Michael Wray
[v1] Thu, 25 Jul 2019 14:58:34 UTC (4,422 KB)
[v2] Thu, 1 Aug 2019 14:13:08 UTC (4,412 KB)
