Interactive video retrieval in the age of effective joint embedding deep models: lessons from the 11th VBS

  • Jakub Lokoč
  • Stelios Andreadis
  • Werner Bailer
  • Aaron Duane
  • Cathal Gurrin
  • Zhixin Ma
  • Nicola Messina
  • Thao-Nhu Nguyen
  • Ladislav Peška
  • Luca Rossetto
  • Loris Sauter
  • Konstantin Schall
  • Klaus Schoeffmann
  • Omar Shahbaz Khan
  • Florian Spiess
  • Lucia Vadicamo
  • Stefanos Vrochidis

Research output: Journal Article or Conference Article in Journal, Journal article, Research, peer-review

Abstract

This paper presents findings of the eleventh Video Browser Showdown competition, where sixteen teams competed in known-item and ad-hoc search tasks. Many of the teams utilized state-of-the-art video retrieval approaches that demonstrated high effectiveness in challenging search scenarios. A broad survey of all utilized approaches is presented in connection with an analysis of the performance of participating teams. Specifically, high-level performance indicators are presented with overall statistics, together with an in-depth analysis of the performance of selected tools that implement result set logging. The analysis reveals evidence that the CLIP model represents a versatile tool for cross-modal video retrieval when combined with interactive search capabilities. Furthermore, the analysis investigates the effect of different users and text query properties on the performance in search tasks. Finally, lessons learned from search task preparation are presented, and a new direction for ad-hoc search based tasks at the Video Browser Showdown is introduced.
Original language: English
Journal: Multimedia Systems
Volume: 29
Issue number: 6
Pages (from-to): 3481-3504
Number of pages: 24
ISSN: 1432-1882
Publication status: Published - 24 Aug 2023

Keywords

  • Interactive video retrieval
  • Video browsing
  • Video content analysis
  • Content-based retrieval
  • Evaluations
