Digital Investigation
Volume 3, Issue 3, September 2006, Pages 138-150
 
doi:10.1016/j.diin.2006.08.010
Published by Elsevier Ltd.

Unique file identification in the National Software Reference Library

Steve Mead

National Institute of Standards & Technology, 100 Bureau Drive, Stop 8970, Gaithersburg, MD 20899, United States

Received 22 June 2006; accepted 4 August 2006. Available online 24 October 2006.


Abstract

The National Software Reference Library (NSRL) provides a repository of known software, file profiles, and file signatures for use by law enforcement and other organizations involved with computer forensic investigations.

During a forensic investigation, hundreds of thousands of files may be encountered. The NSRL is used to identify known files. This can reduce the amount of time spent examining a computer. Matches for common operating systems and applications do not need to be searched, either manually or electronically, for evidence. Additionally, the NSRL is used to determine which software applications are present on a system. This may suggest how the computer was being used and provide information on how and where to search for evidence.
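The filtering step described above can be illustrated with a minimal sketch. It assumes the known signatures are already loaded into an in-memory set of uppercase hex SHA-1 strings; the function names, the set, and the choice of SHA-1 for the lookup are illustrative assumptions, not the NSRL's actual tooling or data format.

```python
import hashlib


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the uppercase hex SHA-1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def partition_by_known(paths, known_hashes):
    """Split paths into (known, unknown) lists by signature lookup.

    known_hashes is a hypothetical set of uppercase hex SHA-1 strings
    standing in for a reference set of known-file signatures.
    """
    known, unknown = [], []
    for path in paths:
        if sha1_of_file(path) in known_hashes:
            known.append(path)
        else:
            unknown.append(path)
    return known, unknown
```

Only the files in the `unknown` list would then need closer examination. That shortcut is sound only if a signature match really implies identical content, which is the uniqueness question the paper examines.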

This paper examines whether the techniques used to create file signatures in the NSRL produce unique results—a core characteristic that the NSRL depends on for the majority of its uses. The uniqueness of the file identification is analyzed via two methods: an empirical analysis of the file signatures within the NSRL and research into the recent attacks on the hash algorithms used to generate the file signatures within the NSRL.
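One way to picture an empirical collision check is sketched below. It is an assumption-laden illustration, not the method used in the paper: each record is a hypothetical dictionary holding the SHA-1, MD5, and CRC-32 signatures of one file, and a collision for one algorithm is flagged when two records share that algorithm's value but differ in a second "witness" signature.

```python
import hashlib
import zlib
from collections import defaultdict


def signatures(data):
    """Compute the three signatures for a bytes object as lowercase hex."""
    return {
        "sha1": hashlib.sha1(data).hexdigest(),
        "md5": hashlib.md5(data).hexdigest(),
        "crc32": format(zlib.crc32(data) & 0xFFFFFFFF, "08x"),
    }


def find_collisions(records, algorithm, witness="sha1"):
    """Return {value: set of witness values} for colliding values.

    A value of `algorithm` counts as colliding when the records that
    share it carry more than one distinct `witness` signature, which
    suggests that different file contents produced the same value.
    """
    groups = defaultdict(set)
    for record in records:
        groups[record[algorithm]].add(record[witness])
    return {value: seen for value, seen in groups.items() if len(seen) > 1}
```

For example, `find_collisions(records, "crc32")` would list CRC-32 values shared by files with different SHA-1 digests. To check SHA-1 itself, a different witness such as `"md5"` has to be passed, since a signature cannot serve as its own witness.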

Keywords: Computer forensics; National Software Reference Library; File hashing; MD5; SHA-1; Digital fingerprint; Collisions; File signature

Article Outline

1. Introduction
2. NSRL file signatures
3. Uniqueness
4. Approach
4.1. Examining the NSRL for collisions
4.2. Likelihood for future collisions
4.3. The impact of the recent attacks on MD5 and SHA-1
5. Results: collisions in the NSRL
5.1. Collisions detected
5.2. Results: distribution of hash values
5.3. Analysis
6. Future collisions—file bias on signature generation
6.1. Statistical test suite overview
6.1.1. Statistical hypothesis testing
6.1.2. Significance level, α
6.1.3. Probability value (P-value)
6.1.4. STS tests
6.1.5. Methodology (STS)
6.2. Results summary
6.2.1. Results: SHA-1 file signatures
6.2.2. Results: MD5 file signatures
6.2.3. Results: CRC-32 file signatures
7. Results: future collisions—detecting file bias through data visualization
7.1. Methodology
7.2. Analysis summary
7.2.1. Distribution: visual analysis of SHA-1
7.2.2. Distribution: visual analysis of MD5
7.2.3. Distribution: visual analysis of CRC-32
8. Implications of MD5 and SHA-1 attacks on forensic use of hashes for file identification
9. Conclusions
9.1. Implications of recent MD5 and SHA-1 attacks on the NSRL
Acknowledgements
References
Vitae



