ABSTRACT
The rapid growth in the use of XML for electronic data exchange introduces new challenges for processing XML data efficiently. Applications that rely heavily on XML must quickly extract the relevant parts of XML documents, often using the XPath language to address them. High-speed execution of XPath queries is therefore becoming a critical requirement in many application domains, including XML databases and event processing. This work explores the potential for accelerating XPath processing in these domains using specialized hardware, which in turn poses the challenge of integrating that hardware with general-purpose application code. We present the design decisions behind an integration layer that bridges applications and the hardware, and describe our implementation. We discuss the factors that affect the acceleration potential and show that, despite the transmission overheads of off-loading XPath processing to the specialized co-processor, significant speedups can be obtained, ranging from a modest 11% improvement in the event-processing domain to a speedup of more than 6x in the healthcare domain.
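The trade-off between acceleration and transmission overhead can be illustrated with a simple back-of-the-envelope model. The sketch below is not taken from the paper: the function name, its parameters, and the sample numbers are all hypothetical, and it assumes an Amdahl-style split of run time into an XPath share and everything else, plus a fixed transfer cost for moving data to and from the co-processor.

```python
def offload_speedup(total_s, xpath_fraction, accel_factor, transfer_s):
    """Estimate end-to-end speedup when XPath work is off-loaded.

    All parameters are illustrative assumptions, not values from the paper:
      total_s        -- software-only run time, in seconds
      xpath_fraction -- share of total_s spent evaluating XPath (0..1)
      accel_factor   -- how many times faster the co-processor evaluates XPath
      transfer_s     -- extra time spent moving data to and from the co-processor
    """
    if total_s <= 0 or accel_factor <= 0 or transfer_s < 0:
        raise ValueError("times and factors must be positive")
    if not 0.0 <= xpath_fraction <= 1.0:
        raise ValueError("xpath_fraction must lie in [0, 1]")

    other_s = total_s * (1.0 - xpath_fraction)         # work left on the CPU
    xpath_s = total_s * xpath_fraction / accel_factor  # accelerated XPath work
    return total_s / (other_s + xpath_s + transfer_s)


if __name__ == "__main__":
    # Hypothetical workloads: one dominated by XPath, one barely using it.
    print(offload_speedup(10.0, 0.5, 10.0, 0.5))
    print(offload_speedup(10.0, 0.1, 10.0, 1.0))
```

Under this model, a workload that spends little of its time in XPath can end up slower after off-loading, because the transfer cost outweighs the time saved. A workload dominated by XPath evaluation gets much closer to the raw acceleration factor.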
Index Terms
- Case studies in hardware XPath acceleration