ABSTRACT
Mapping packet processing applications onto embedded network processors (NPs) is challenging because of the unique constraints of NP systems and the characteristics of network application domains. A notable difference from general multiprocessor task scheduling is that NPs are often programmed into a hybrid parallel and pipeline topology.
In this paper, we introduce a multilevel balancing and refining algorithm for NP program mapping. We use a divide-and-conquer approach to recursively bipartition the task graph into disjoint subdomains. At each level of bipartition, the processing resources are co-allocated so that an estimate of throughput can be derived. Bipartitioning continues until the code of the tasks fits into the instruction memory of the processing elements. The algorithm then iteratively refines the solution by migrating tasks from the bottleneck stage to other stages. We evaluate the scheme on a suite of NP benchmarks using the SUIF/Machine SUIF compiler and the Intel IXA Architecture Tool. The throughput improvement is significant: average throughput increases by 20%, with a maximum increase of 108%.
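The flow described in the abstract (recursive bipartition with resource co-allocation, then refinement of the bottleneck stage) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a linear task chain instead of a general task graph, picks the most cycle-balanced cut rather than a min-cut, and estimates a stage's per-packet time as its total cycles divided by the number of processing elements (PEs) assigned to it. The names `Task`, `Stage`, `bipartition`, `refine`, and `bottleneck_time`, and all cost figures, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    cycles: int     # assumed per-packet processing cost
    code_size: int  # assumed instruction-memory footprint


@dataclass
class Stage:
    tasks: list
    pes: int        # PEs running this stage in parallel

    @property
    def cycles(self):
        return sum(t.cycles for t in self.tasks)

    @property
    def code_size(self):
        return sum(t.code_size for t in self.tasks)

    @property
    def time(self):
        # Assumption: PEs of one stage work on different packets in parallel.
        return self.cycles / self.pes


def bipartition(tasks, pes, imem):
    """Recursively split a task chain until each stage's code fits in imem.

    A stage that cannot be split further (one task or one PE) is returned
    as-is, even if its code does not fit; the sketch does not handle that.
    """
    stage = Stage(list(tasks), pes)
    if stage.code_size <= imem or len(tasks) == 1 or pes == 1:
        return [stage]
    total = stage.cycles
    # Pick the cut point that best balances cycles between the two halves.
    cut = min(range(1, len(tasks)),
              key=lambda i: abs(2 * sum(t.cycles for t in tasks[:i]) - total))
    left, right = tasks[:cut], tasks[cut:]
    # Co-allocate PEs in proportion to load, keeping at least one per side.
    left_pes = round(pes * sum(t.cycles for t in left) / total)
    left_pes = max(1, min(pes - 1, left_pes))
    return (bipartition(left, left_pes, imem)
            + bipartition(right, pes - left_pes, imem))


def bottleneck_time(stages):
    """Per-packet time of the slowest stage; throughput is its reciprocal."""
    return max(s.time for s in stages)


def refine(stages, imem):
    """Move boundary tasks out of the bottleneck stage while that helps."""
    improved = True
    while improved:
        improved = False
        b = max(range(len(stages)), key=lambda i: stages[i].time)
        best = bottleneck_time(stages)
        for nb, take_last in ((b - 1, False), (b + 1, True)):
            if not 0 <= nb < len(stages) or len(stages[b].tasks) < 2:
                continue
            src, dst = stages[b], stages[nb]
            task = src.tasks[-1] if take_last else src.tasks[0]
            if dst.code_size + task.code_size > imem:
                continue  # the move would overflow instruction memory
            # Tentatively migrate the task to the neighbouring stage.
            src.tasks.pop(-1 if take_last else 0)
            dst.tasks.insert(0 if take_last else len(dst.tasks), task)
            if bottleneck_time(stages) < best:
                improved = True
                break
            # No gain: undo the move.
            dst.tasks.pop(0 if take_last else -1)
            src.tasks.insert(len(src.tasks) if take_last else 0, task)
    return stages


# Usage with made-up numbers: five tasks, four PEs, 400 words of code store.
chain = [Task("rx", 10, 100), Task("parse", 30, 200), Task("lookup", 60, 300),
         Task("modify", 20, 200), Task("tx", 10, 100)]
mapping = refine(bipartition(chain, pes=4, imem=400), imem=400)
for s in mapping:
    print([t.name for t in s.tasks], s.pes, s.time)
```

Each successful migration strictly lowers the bottleneck time, so the refinement loop terminates. Only tasks at stage boundaries are moved, which keeps the pipeline order of the chain intact; a general task graph would need a dependency check instead.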
Index Terms
- Program mapping onto network processors by recursive bipartitioning and refining