A 3-dimensional SIFT descriptor and its application to action recognition

Authors:
Paul Scovanner, University of Central Florida, Orlando
Saad Ali, University of Central Florida, Orlando
Mubarak Shah, University of Central Florida, Orlando

MM '07: Proceedings of the 15th ACM international conference on Multimedia, September 2007, Pages 357–360. https://doi.org/10.1145/1291233.1291311

Published: 29 September 2007

ABSTRACT

In this paper we introduce a 3-dimensional (3D) SIFT descriptor for video or 3D imagery such as MRI data. We also show how this new descriptor is able to better represent the 3D nature of video data in the application of action recognition. This paper will show how 3D SIFT is able to outperform previously used description methods in an elegant and efficient manner. We use a bag of words approach to represent videos, and present a method to discover relationships between spatio-temporal words in order to better describe the video data.
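
The bag-of-words representation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a vocabulary of "words" (cluster centres) has already been learned, uses toy 3-element vectors in place of real 3D SIFT descriptors, and the function names `nearest_word` and `bag_of_words` are hypothetical.

```python
from math import dist


def nearest_word(descriptor, vocabulary):
    """Return the index of the vocabulary entry closest to the descriptor."""
    return min(range(len(vocabulary)),
               key=lambda i: dist(descriptor, vocabulary[i]))


def bag_of_words(descriptors, vocabulary):
    """Build a normalized histogram of word occurrences for one video."""
    counts = [0] * len(vocabulary)
    for d in descriptors:
        counts[nearest_word(d, vocabulary)] += 1
    total = sum(counts)
    if total == 0:
        # No descriptors were extracted: return an all-zero signature.
        return [0.0] * len(vocabulary)
    return [c / total for c in counts]


# Toy data (assumed for illustration): two words, four descriptors.
vocabulary = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
descriptors = [(0.1, 0.0, 0.0), (0.9, 1.0, 1.0),
               (1.0, 1.0, 0.8), (0.8, 0.9, 1.0)]

print(bag_of_words(descriptors, vocabulary))
```

Each video is reduced to one fixed-length histogram regardless of how many descriptors it yields, which is what lets a standard classifier compare videos of different lengths.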

Index Terms

A 3-dimensional sift descriptor and its application to action recognition
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
      2. Computer vision tasks
        Scene understanding

Published in
MM '07: Proceedings of the 15th ACM international conference on Multimedia
September 2007
1115 pages
ISBN:9781595937025
DOI:10.1145/1291233
General Chairs: Rainer Lienhart (University of Augsburg, Germany), Anand R. Prasad (DoCoMo Euro-Labs, Germany)
Program Chairs: Alan Hanjalic (Delft University of Technology, The Netherlands), Sunghyun Choi (Seoul National University, South Korea), Brian Bailey (University of Illinois at Urbana-Champaign), Nicu Sebe (University of Amsterdam, The Netherlands)
Copyright © 2007 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Acceptance Rates
Overall Acceptance Rate: 995 of 4,171 submissions, 24%