Evaluating Temporal Queries Over Video Feeds

Authors:
Yueting Chen, York University, Toronto, ON, Canada
Xiaohui Yu, York University, Toronto, ON, Canada
Nick Koudas, University of Toronto, Toronto, ON, Canada
Ziqiang Yu, Yantai University, Yantai, China

SIGMOD '21: Proceedings of the 2021 International Conference on Management of Data, June 2021, Pages 287–299, https://doi.org/10.1145/3448016.3452803

Published: 18 June 2021

ABSTRACT

Recent advances in Computer Vision and Deep Learning have made it possible to efficiently extract structured information from frames of video feeds. As a result, a stream of objects and their associated classes, along with unique object identifiers derived via object tracking, can be generated, providing unique objects as they are captured across frames. In this paper we initiate a study of temporal queries involving objects and their co-occurrences in video feeds. For example, queries that identify video segments during which the same two red cars and the same two humans appear jointly for five minutes are of interest to many applications ranging from law enforcement to security and safety. We take the first step and define such queries in a way that incorporates certain physical aspects of video capture such as object occlusion. We present an architecture consisting of three layers, namely object detection/tracking, intermediate data generation, and query evaluation. We propose two techniques, Marked Frame Set (MFS) and Sparse State Graph (SSG), to organize all detected objects in the intermediate data generation layer, which, given the queries, minimizes the number of objects and frames that have to be considered during query evaluation. We also introduce an algorithm called SSG-CM that processes incoming frames against the SSG and efficiently prunes objects and frames unrelated to query evaluation, while maintaining all states required for succinct query evaluation. We present the results of a thorough experimental evaluation utilizing both real and synthetic data, establishing the trade-offs between MFS and SSG. We stress various parameters of interest in our evaluation and demonstrate that the proposed query evaluation methodology, coupled with the proposed algorithms, is capable of evaluating temporal queries over video feeds efficiently, achieving orders-of-magnitude performance benefits.
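
To make the kind of query described above concrete, the following is a minimal Python sketch of a naive baseline that scans tracker output for segments in which the same objects co-occur. It is not the MFS, SSG, or SSG-CM technique from the paper, whose details are not given in the abstract. The frame representation (a mapping from tracker-assigned object id to class label), the function names `satisfies` and `cooccurrence_segments`, and the `min_frames` threshold are all assumptions made for illustration, and the sketch ignores occlusion entirely.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical frame representation: tracker-assigned object id -> class label.
Frame = Dict[int, str]


def satisfies(objects: Dict[int, str], required: Dict[str, int]) -> bool:
    """True if `objects` holds at least the required number of each class."""
    counts = Counter(objects.values())
    return all(counts[cls] >= n for cls, n in required.items())


def cooccurrence_segments(
    frames: List[Frame], required: Dict[str, int], min_frames: int
) -> List[Tuple[int, int]]:
    """Return maximal inclusive (start, end) frame ranges of length at least
    `min_frames` in which one fixed set of objects meeting `required`
    appears in every frame. Naive quadratic scan, for illustration only."""
    segments: List[Tuple[int, int]] = []
    for start in range(len(frames)):
        common = dict(frames[start])
        end = None
        for i in range(start, len(frames)):
            # Keep only the objects seen in every frame from start to i.
            common = {oid: c for oid, c in common.items() if oid in frames[i]}
            if not satisfies(common, required):
                break
            end = i
        if end is not None and end - start + 1 >= min_frames:
            # Skip ranges contained in the previously reported range.
            if not segments or end > segments[-1][1]:
                segments.append((start, end))
    return segments


demo_frames = [
    {1: "car", 2: "car", 3: "person"},
    {1: "car", 2: "car", 3: "person"},
    {1: "car", 2: "car", 3: "person", 4: "person"},
    {1: "car", 3: "person"},
    {1: "car", 5: "car", 3: "person"},
]
print(cooccurrence_segments(demo_frames, {"car": 2, "person": 1}, 2))  # [(0, 2)]
```

In the demo, cars 1 and 2 and person 3 persist together over frames 0 to 2. Frame 4 again contains two cars and a person, but car 5 is a different object and the run is only one frame long, so it does not qualify under a two-frame minimum. A scan like this revisits every frame for every start position, which is the kind of work the abstract says the intermediate data layer is meant to reduce.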

Index Terms

1. Information systems
  1. Data management systems

Published in
SIGMOD '21: Proceedings of the 2021 International Conference on Management of Data
June 2021
2969 pages
ISBN:9781450383431
DOI:10.1145/3448016
General Chairs: Guoliang Li, Tsinghua University (China); Zhanhuai Li, Northwestern Polytechnical University (China)
Program Chairs: Stratos Idreos, Harvard University (USA); Divesh Srivastava, AT&T (USA)
Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
data management
temporal queries
video queries
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 785 of 4,003 submissions, 20%