ABSTRACT
We present a fast, recursive construction of an n-element priority queue from exponentially smaller hardware priority queues and a RAM of size n. All priority queue implementations to date require either O(log n) instructions per operation, space exponential in the key size, or expensive special-purpose hardware whose cost and latency increase dramatically with the priority queue size. Hence, constructing a priority queue (PQ) from considerably smaller (and also much faster) hardware priority queues while maintaining O(1) steps per PQ operation is critical. Here we present such an acceleration technique, called the Power Priority Queue (PPQ) technique. Specifically, an n-element PPQ is constructed from 2k-1 primitive priority queues of size n^(1/k) (k = 2, 3, ...) and a RAM of size n, where the throughput of the construct beats that of a single primitive hardware priority queue of size n. For example, an n-element PQ can be constructed from either three primitive hardware priority queues of size √n (k = 2) or five of size n^(1/3) (k = 3).
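To make the sizing rule concrete, the following minimal Python sketch tabulates the components named above: 2k-1 primitive queues, each holding about the k-th root of n elements, plus a RAM of size n. The function names `ceil_kth_root` and `ppq_components` are hypothetical, and rounding the root up to an integer is an assumption made for illustration. The sketch only does the arithmetic; it does not implement the PPQ itself.

```python
def ceil_kth_root(n: int, k: int) -> int:
    """Return the smallest integer r such that r**k >= n (exact integer arithmetic)."""
    lo, hi = 1, max(1, n)
    while lo < hi:
        mid = (lo + hi) // 2
        if mid ** k >= n:
            hi = mid
        else:
            lo = mid + 1
    return lo


def ppq_components(n: int, k: int) -> dict:
    """Tabulate the hardware budget of an n-element PPQ for a given k.

    Assumption: each primitive queue holds the k-th root of n, rounded up.
    """
    if n < 1 or k < 2:
        raise ValueError("expected n >= 1 and k >= 2")
    return {
        "primitive_queues": 2 * k - 1,
        "primitive_queue_size": ceil_kth_root(n, k),
        "ram_size": n,
    }


if __name__ == "__main__":
    for k in (2, 3):
        print(k, ppq_components(1_000_000, k))
```

For n = 1,000,000 this gives three primitive queues of 1,000 entries each for k = 2, and five primitive queues of 100 entries each for k = 3, in both cases alongside a RAM of 1,000,000 entries.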
Applying our technique to a TCAM-based priority queue results in TCAM-PPQ, a scalable, perfect line-rate fair queuing of millions of concurrent connections at speeds of 100 Gbps. This demonstrates the benefits of our scheme when used with hardware TCAM; we expect similar results with systolic arrays, shift registers, and similar technologies.
As a by-product of our technique, we present an O(n)-time sorting algorithm for a system equipped with a TCAM of O(w√n) entries, where n is the number of items and w is the maximum number of bits required to represent an item, improving on a previous result that used a TCAM of Ω(n) entries. Finally, we provide a lower bound on the time complexity of sorting n elements with a TCAM of size O(n) that matches our TCAM-based sorting algorithm.
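As a rough illustration of the gap between the two TCAM sizes, the sketch below compares w·√n with n for sample values. The function name `tcam_entries_estimate` is hypothetical, and the hidden constants in the O(·) and Ω(·) bounds are ignored (treated as 1), so the figures indicate only the order of magnitude, not actual hardware requirements.

```python
import math


def tcam_entries_estimate(n: int, w: int) -> dict:
    """Compare w * ceil(sqrt(n)) with n, ignoring asymptotic constants (assumption)."""
    root = math.isqrt(n)
    if root * root < n:  # round the square root up for non-perfect squares
        root += 1
    return {"w_sqrt_n_entries": w * root, "n_entries": n}


if __name__ == "__main__":
    print(tcam_entries_estimate(1_000_000, 32))
```

With n = 1,000,000 items and w = 32 bits, w·√n is 32,000 entries against 1,000,000 entries for a TCAM linear in n.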
Index Terms
- Recursive design of hardware priority queues