DOI: 10.1145/3448016.3452803
research-article

Evaluating Temporal Queries Over Video Feeds

Published: 18 June 2021

ABSTRACT

Recent advances in computer vision and deep learning have made it possible to efficiently extract structured information from the frames of video feeds. As a result, a stream of objects and their associated classes can be generated, together with unique object identifiers derived via object tracking, so that each object can be identified as it is captured across frames. In this paper we initiate a study of temporal queries involving objects and their co-occurrences in video feeds. For example, queries that identify video segments during which the same two red cars and the same two humans appear jointly for five minutes are of interest to many applications, ranging from law enforcement to security and safety. We take the first step and define such queries so that they incorporate certain physical aspects of video capture, such as object occlusion. We present an architecture consisting of three layers: object detection/tracking, intermediate data generation, and query evaluation. We propose two techniques, Marked Frame Set (MFS) and Sparse State Graph (SSG), to organize all detected objects in the intermediate data generation layer; given the queries, these techniques effectively minimize the number of objects and frames that have to be considered during query evaluation. We also introduce an algorithm called SSG-CM that processes incoming frames against the SSG and efficiently prunes objects and frames unrelated to query evaluation, while maintaining all states required for succinct query evaluation. We present the results of a thorough experimental evaluation using both real and synthetic data, establishing the trade-offs between MFS and SSG. We stress various parameters of interest in our evaluation and demonstrate that the proposed query evaluation methodology, coupled with the proposed algorithms, can evaluate temporal queries over video feeds efficiently, achieving orders-of-magnitude performance benefits.
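To make the query semantics concrete, the following is a minimal sketch of a naive baseline for a co-occurrence query with a tolerance for occlusion. It is not MFS, SSG, or SSG-CM, whose details the abstract does not give. It assumes that each frame has already been reduced to the set of tracked object identifiers visible in it, that the query names a fixed set of identifiers, and that occlusion is modeled as a bounded number of consecutive frames in which some queried object is missing. The function name `co_occurrence_segments` and the parameters `min_frames` and `max_gap` are hypothetical and chosen for illustration only.

```python
from typing import Hashable, Iterable, List, Optional, Set, Tuple


def co_occurrence_segments(
    frames: Iterable[Set[Hashable]],
    query_ids: Set[Hashable],
    min_frames: int,
    max_gap: int = 0,
) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) frame ranges in which all query_ids co-occur.

    Hypothetical baseline: a range may contain runs of at most `max_gap`
    consecutive frames where some queried object is missing (an assumed,
    simplified model of occlusion). A range is reported only if it spans
    at least `min_frames` frames.
    """
    segments: List[Tuple[int, int]] = []
    start: Optional[int] = None     # first frame of the open range, if any
    last_hit: Optional[int] = None  # most recent frame with all objects visible

    for i, present in enumerate(frames):
        if query_ids <= present:
            # Every queried object is visible in this frame.
            if start is None:
                start = i
            last_hit = i
        elif start is not None and i - last_hit > max_gap:
            # The gap is too long to be explained by occlusion: close the range.
            if last_hit - start + 1 >= min_frames:
                segments.append((start, last_hit))
            start = None

    # Close a range that is still open when the feed ends.
    if start is not None and last_hit - start + 1 >= min_frames:
        segments.append((start, last_hit))
    return segments


# Objects 1 and 2 co-occur in frames 0-4, with object 2 occluded in frame 2.
frames = [{1, 2}, {1, 2}, {1}, {1, 2}, {1, 2}, {3}, {3}, {1, 2}]
print(co_occurrence_segments(frames, {1, 2}, min_frames=4, max_gap=1))
# prints [(0, 4)]
```

This baseline scans every frame for a single fixed query. The abstract indicates that the proposed techniques instead organize detected objects so that fewer objects and frames need to be considered during query evaluation.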

Supplemental Material

3448016.3452803.mp4 (mp4, 120.5 MB)

    • Published in

      SIGMOD '21: Proceedings of the 2021 International Conference on Management of Data
      June 2021
      2969 pages
      ISBN: 9781450383431
      DOI: 10.1145/3448016

      Copyright © 2021 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 785 of 4,003 submissions, 20%
