VIRET at Video Browser Showdown 2020

Lokoč, Jakub; Kovalčík, Gregor; Souček, Tomáš

doi:10.1007/978-3-030-37734-2_70

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11962)

Included in the following conference series:

International Conference on Multimedia Modeling

Abstract

During the last three years, the most successful systems at the Video Browser Showdown employed effective retrieval models in which raw video data are preprocessed automatically in advance to extract semantic or low-level features of selected frames or shots. This enables users to express their search intent in the form of keywords, sketches, example queries, or a combination of these. In this paper, we present new extensions to our interactive video retrieval system VIRET, which won the Video Browser Showdown in 2018 and achieved second place at the Video Browser Showdown 2019 and the Lifelog Search Challenge 2019. The new features focus both on updates to the retrieval models and on interface modifications that help users specify queries by means of informative visualizations.

Notes

1. The V3C1 dataset [18] is currently used at VBS.
2. Authors of a tool are considered to be experts as they are expected to use the tool more effectively.
3. https://cloud.google.com/speech-to-text/.

References

1. Amato, G., et al.: VISIONE at VBS2019. In: Kompatsiaris, I., Huet, B., Mezaris, V., Gurrin, C., Cheng, W.-H., Vrochidis, S. (eds.) MMM 2019. LNCS, vol. 11296, pp. 591–596. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-05716-9_51
2. Andreadis, S., et al.: VERGE in VBS 2019. In: Kompatsiaris, I., Huet, B., Mezaris, V., Gurrin, C., Cheng, W.-H., Vrochidis, S. (eds.) MMM 2019. LNCS, vol. 11296, pp. 602–608. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-05716-9_53
3. Baeza-Yates, R.A., Ribeiro-Neto, B.A.: Modern Information Retrieval - The Concepts and Technology Behind Search, 2nd edn. Pearson Education Ltd., Harlow (2011)
4. Barthel, K.U., Hezel, N.: Visually exploring millions of images using image maps and graphs. In: Huet, B., Vrochidis, S., Chang, E. (eds.) Big Data Analytics for Large-Scale Multimedia Search, pp. 251–275. John Wiley and Sons Inc. (2019)
5. Cobârzan, C., et al.: Interactive video search tools: a detailed analysis of the video browser showdown 2015. Multimed. Tools Appl. 76(4), 5539–5571 (2017). https://doi.org/10.1007/s11042-016-3661-2
6. Deng, J., Dong, W., Socher, R., Li, L., Li, K., Fei-Fei, L.: Imagenet: A large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255 (June 2009). https://doi.org/10.1109/CVPR.2009.5206848
7. Dong, J., Li, X., Snoek, C.G.M.: Predicting visual features from text for image and video caption retrieval. IEEE Trans. Multimedia 20(12), 3377–3388 (2018). https://doi.org/10.1109/TMM.2018.2832602
8. Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press, Cambridge (2016). http://www.deeplearningbook.org
9. Gurrin, C., et al.: [invited papers] Comparing approaches to interactive lifelog search at the lifelog search challenge (lsc2018). ITE Trans. Med. Technol. Appl. 7(2), 46–59 (2019). https://doi.org/10.3169/mta.7.46
10. Li, X., Xu, C., Yang, G., Chen, Z., Dong, J.: W2VV++: fully deep learning for ad-hoc video search. In: Proceedings of the 27th ACM International Conference on Multimedia, MM 2019, Nice, France, 21–25 October 2019, pp. 1786–1794 (2019). https://doi.org/10.1145/3343031.3350906
11. Lokoč, J., Bailer, W., Schoeffmann, K., Münzer, B., Awad, G.: On influential trends in interactive video retrieval: video browser showdown 2015–2017. IEEE Trans. Multimed. 20(12), 3361–3376 (2018). https://doi.org/10.1109/TMM.2018.2830110
12. Lokoč, J., et al.: Interactive search or sequential browsing? A detailed analysis of the video browser showdown 2018. ACM Trans. Multimed. Comput. Commun. Appl. 15(1), 29:1–29:18 (2019). https://doi.org/10.1145/3295663
13. Mettes, P., Koelma, D.C., Snoek, C.G.: The imagenet shuffle: Reorganized pre-training for video event detection. In: Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, pp. 175–182. ICMR ’16, ACM, New York, NY, USA (2016). https://doi.org/10.1145/2911996.2912036
14. Lokoč, J., Kovalčík, G., Souček, T., Moravec, J., Čech, P.: A framework for effective known-item search in video. In: Proceedings of the 27th ACM International Conference on Multimedia, MM 2019, pp. 1777–1785, ACM, New York (2019). https://doi.org/10.1145/3343031.3351046
15. Lokoč, J., Kovalčík, G., Souček, T., Moravec, J., Čech, P.: Viret: a video retrieval tool for interactive known-item search. In: Proceedings of the 2019 on International Conference on Multimedia Retrieval, ICMR 2019, pp. 177–181. ACM, New York (2019). https://doi.org/10.1145/3323873.3325034
16. Nguyen, P.A., Ngo, C.-W., Francis, D., Huet, B.: VIREO @ video browser showdown 2019. In: Kompatsiaris, I., Huet, B., Mezaris, V., Gurrin, C., Cheng, W.-H., Vrochidis, S. (eds.) MMM 2019. LNCS, vol. 11296, pp. 609–615. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-05716-9_54
17. Rossetto, L., Amiri Parian, M., Gasser, R., Giangreco, I., Heller, S., Schuldt, H.: Deep learning-based concept detection in vitrivr. In: Kompatsiaris, I., Huet, B., Mezaris, V., Gurrin, C., Cheng, W.-H., Vrochidis, S. (eds.) MMM 2019. LNCS, vol. 11296, pp. 616–621. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-05716-9_55
18. Rossetto, L., Schuldt, H., Awad, G., Butt, A.A.: V3C – a research video collection. In: Kompatsiaris, I., Huet, B., Mezaris, V., Gurrin, C., Cheng, W.-H., Vrochidis, S. (eds.) MMM 2019. LNCS, vol. 11295, pp. 349–360. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-05710-7_29
19. Schoeffmann, K., Hudelist, M.A., Huber, J.: Video interaction tools: a survey of recent work. ACM Comput. Surv. 48(1), 14:1–14:34 (2015). https://doi.org/10.1145/2808796
20. Schoeffmann, K., Münzer, B., Leibetseder, A., Primus, J., Kletz, S.: Autopiloting feature maps: the deep interactive video exploration (diveXplore) system at VBS2019. In: Kompatsiaris, I., Huet, B., Mezaris, V., Gurrin, C., Cheng, W.-H., Vrochidis, S. (eds.) MMM 2019. LNCS, vol. 11296, pp. 585–590. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-05716-9_50
21. Thomee, B., Lew, M.S.: Interactive search in image retrieval: a survey. Int. J. Multimed. Inf. Retrieval 1(2), 71–86 (2012). https://doi.org/10.1007/s13735-012-0014-4
22. Zoph, B., Vasudevan, V., Shlens, J., Le, Q.V.: Learning transferable architectures for scalable image recognition. CoRR abs/1707.07012 (2017). http://arxiv.org/abs/1707.07012

Acknowledgments

This paper has been supported by the Czech Science Foundation (GAČR) project 19-22071Y and by Charles University grant SVV-260451. We would also like to thank Přemysl Čech and Vít Škrhák for their help with the interface in WPF.

Author information

Authors and Affiliations

SIRET Research Group, Department of Software Engineering, Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic
Jakub Lokoč, Gregor Kovalčík & Tomáš Souček

Corresponding author

Correspondence to Jakub Lokoč.

Editor information

Editors and Affiliations

Korea Advanced Institute of Science and Technology, Daejeon, Korea (Republic of)
Yong Man Ro
National Chiao Tung University, Hsinchu, Taiwan
Wen-Huang Cheng
Korea Advanced Institute of Science and Technology, Daejeon, Korea (Republic of)
Junmo Kim
National Cheng Kung University, Tainan City, Taiwan
Wei-Ta Chu
Tsinghua University, Beijing, China
Peng Cui
Korea Advanced Institute of Science and Technology, Daejeon, Korea (Republic of)
Jung-Woo Choi
National Tsing Hua University, Hsinchu, Taiwan
Min-Chun Hu
Ghent University, Ghent, Belgium
Wesley De Neve

About this paper

Cite this paper

Lokoč, J., Kovalčík, G., Souček, T. (2020). VIRET at Video Browser Showdown 2020. In: Ro, Y., et al. MultiMedia Modeling. MMM 2020. Lecture Notes in Computer Science, vol 11962. Springer, Cham. https://doi.org/10.1007/978-3-030-37734-2_70

DOI: https://doi.org/10.1007/978-3-030-37734-2_70
Published: 24 December 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-37733-5
Online ISBN: 978-3-030-37734-2
eBook Packages: Computer Science, Computer Science (R0)
