The Design of Distributed File System Based on HDFS
Abstract:
HDFS is a distributed file system designed for access to large files, and it stores small files inefficiently. To address this problem, this paper designs a new storage architecture based on HDFS. The paper mainly uses SequenceFile to merge small files; to address the shortcomings of SequenceFile-based merging, it proposes a solution and designs a new system structure based on HDFS. The system adds a file judgment unit to mark and identify small files, creates a local index file that records the size and offset of each small file to improve retrieval efficiency, and finally uses binary serialization to merge the small files, so that they are written into large files in time order.
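The merge-and-index idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation and does not use Hadoop's SequenceFile API: the size threshold, the record layout (two 4-byte big-endian lengths, then the name, then the data), and the function names `is_small`, `merge_small_files`, and `read_small_file` are all assumptions made for illustration, and an in-memory dictionary stands in for the local index file.

```python
import io
import struct

# Hypothetical cutoff in bytes; the abstract does not state a threshold.
SMALL_FILE_THRESHOLD = 1024


def is_small(data: bytes, threshold: int = SMALL_FILE_THRESHOLD) -> bool:
    """Stand-in for the 'file judgment unit': flag files below the cutoff."""
    return len(data) < threshold


def merge_small_files(files, container):
    """Append (name, data) pairs to `container` in arrival (time) order.

    Returns an index mapping name -> (offset, size), where offset is the
    position of the file's data inside the container.
    """
    index = {}
    container.seek(0, io.SEEK_END)  # always append at the end
    for name, data in files:
        key = name.encode("utf-8")
        # Assumed record header: name length and data length, big-endian.
        container.write(struct.pack(">II", len(key), len(data)))
        container.write(key)
        offset = container.tell()
        container.write(data)
        index[name] = (offset, len(data))
    return index


def read_small_file(container, index, name):
    """Fetch one small file directly via its recorded offset and size."""
    offset, size = index[name]
    container.seek(offset)
    return container.read(size)


if __name__ == "__main__":
    incoming = [("a.txt", b"hello"), ("b.txt", b"small file two")]
    small = [(n, d) for n, d in incoming if is_small(d)]
    merged = io.BytesIO()
    idx = merge_small_files(small, merged)
    print(read_small_file(merged, idx, "b.txt"))
```

Because the index stores the offset and size of each file, a lookup seeks straight to the data instead of scanning the merged file record by record, which is the retrieval benefit the abstract attributes to the local index.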
Info:
Pages:
2733-2736
Online since:
September 2013