The Design of Distributed File System Based on HDFS

Abstract:

HDFS is a distributed file system designed for access to large files, and it is inefficient at storing small files. To address this, the paper designs a new storage architecture based on HDFS. It mainly uses SequenceFile to merge small files and, to overcome the shortcomings of SequenceFile-based merging, proposes a solution and a new system structure built on HDFS. The system adds a file judgment unit to mark and identify small files, creates a local index file that records the size and offset of each small file to improve retrieval efficiency, and finally uses binary serialization to merge the small files, so that they are written into large files in time order.
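The three steps named in the abstract (judging whether a file is small, merging by binary serialization in time order, and recording size and offset in a local index) can be illustrated with a minimal in-memory sketch. This is not the paper's implementation and does not use HDFS or SequenceFile; the size threshold, the record layout, and every function name below are assumptions made for illustration only.

```python
import io
import struct

# Hypothetical cutoff; the abstract does not state how "small" is defined.
SMALL_FILE_THRESHOLD = 1024 * 1024


def is_small(data: bytes, threshold: int = SMALL_FILE_THRESHOLD) -> bool:
    """File judgment step: flag files below the size threshold."""
    return len(data) < threshold


def merge_small_files(files, threshold: int = SMALL_FILE_THRESHOLD):
    """Merge small files into one blob, ordered by timestamp.

    files: iterable of (name, timestamp, data) tuples.
    Returns (merged_bytes, index, skipped), where index maps each small
    file's name to (offset, size) of its payload inside merged_bytes and
    skipped lists the names judged too large to merge.
    """
    merged = io.BytesIO()
    index = {}
    skipped = []
    for name, _timestamp, data in sorted(files, key=lambda f: f[1]):
        if not is_small(data, threshold):
            skipped.append(name)
            continue
        key = name.encode("utf-8")
        # Assumed record layout: two big-endian 4-byte lengths, key, payload.
        merged.write(struct.pack(">II", len(key), len(data)))
        merged.write(key)
        index[name] = (merged.tell(), len(data))
        merged.write(data)
    return merged.getvalue(), index, skipped


def read_small_file(merged: bytes, index, name: str) -> bytes:
    """Fetch one small file directly through the index, without scanning."""
    offset, size = index[name]
    return merged[offset:offset + size]
```

In this sketch a lookup costs one dictionary access plus one slice, which is the retrieval benefit the abstract attributes to the local index; without the index, a reader would have to walk the records from the start of the merged file.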

Info:

Pages: 2733-2736

Online since: September 2013

