ABSTRACT
Bing's monetization pipeline is one of the largest and most critical streaming workloads deployed in Microsoft's internal data lake. The pipeline runs 24/7 across 3,500 YARN containers and must meet a Service Level Objective (SLO) of low tail latency. In this paper, we highlight some of the unique challenges imposed by this scale of operation: other concurrent workloads sharing the cluster may cause random performance deterioration; unavailability of external dependencies may temporarily stall the pipeline; and scarcity in the underlying resource manager may cause arbitrarily long delays or rejections of container allocation requests. Weathering these challenges requires specially tailored dynamic control policies that react to issues as they arise. We focus on reducing tail latency, i.e., the 99th percentile (p99), by detecting and mitigating slow instances through speculative replication. We show that widely used approaches do not satisfactorily solve this problem at our scale. A conservative approach is hesitant to acquire additional resources, reacts too slowly to changes in the environment, and therefore achieves little improvement in p99 latency. An aggressive approach, on the other hand, overwhelms the underlying resource manager with unnecessary resource requests and paradoxically worsens p99 latency. Our proposed approach, Spur, is designed for this challenging environment: it combines aggressive detection of slow instances with smart pruning of false positives to achieve a far better trade-off between these conflicting objectives. Using only 0.5% additional resources (similar to the conservative approach), we demonstrate a 10%-38% improvement in tail latency over both conservative and aggressive approaches.
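The core idea described in the abstract — aggressive detection of slow instances followed by pruning of false positives, under a small resource budget — can be illustrated with a minimal sketch. This is a hypothetical illustration only, not Spur's actual algorithm: the thresholding rule, the trend-based pruning check, and all names (`pick_replication_candidates`, `factor`, `budget`) are assumptions made for exposition.

```python
from statistics import median

def pick_replication_candidates(latencies, history, factor=1.5, budget=2):
    """Hypothetical sketch: aggressively flag slow instances, then prune
    likely false positives before requesting speculative replicas.

    latencies: {instance_id: latest processing latency}
    history:   {instance_id: recent latency samples, oldest first}
    """
    med = median(latencies.values())
    # Aggressive detection: anything noticeably above the median is suspect.
    suspects = [i for i, lat in latencies.items() if lat > factor * med]

    # Pruning: drop suspects whose latency is already recovering,
    # i.e. whose most recent samples trend downward (likely a transient blip).
    def still_degrading(i):
        recent = history.get(i, [])[-3:]
        return len(recent) < 2 or recent[-1] >= recent[0]

    candidates = [i for i in suspects if still_degrading(i)]
    # Cap requests so the underlying resource manager is not flooded
    # with unnecessary container allocations.
    return sorted(candidates, key=lambda i: latencies[i], reverse=True)[:budget]
```

Under this sketch, an instance that is slow but recovering is pruned rather than replicated, which captures the trade-off the abstract describes: detection stays aggressive, yet replica requests stay within a small budget.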