ABSTRACT
Recent research and bug reports have shown that work conservation, the property that a core is idle only if no other core is overloaded, is not guaranteed by Linux's CFS or FreeBSD's ULE multicore schedulers. Indeed, multicore schedulers are challenging to specify and verify: they must operate under stringent performance requirements, while handling very large numbers of concurrent operations on threads. As a consequence, the verification of correctness properties of schedulers has not yet been considered.
In this paper, we propose an approach, based on a domain-specific language and theorem provers, for developing schedulers with provable properties. We introduce the notion of concurrent work conservation (CWC), a relaxed definition of work conservation that can be achieved in a concurrent system where threads can be created, unblocked and blocked concurrently with other scheduling events. We implement several scheduling policies, inspired by CFS and ULE. We show that our schedulers obtain the same level of performance as production schedulers, while concurrent work conservation is satisfied.
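As a minimal sketch of the work-conservation property stated above, the check below evaluates it on a single snapshot of per-core thread counts. The function names are hypothetical, and the threshold for "overloaded" (two or more threads on a core) is an assumption made for illustration; the abstract does not define it. The sketch covers only the classic, snapshot-based property and does not model the relaxed concurrent variant (CWC).

```python
from typing import Sequence


def is_idle(nr_threads: int) -> bool:
    """A core is idle when it has no thread to run."""
    return nr_threads == 0


def is_overloaded(nr_threads: int) -> bool:
    """Assumption: a core is overloaded when it holds two or more threads
    (one can run, so at least one is left waiting)."""
    return nr_threads >= 2


def is_work_conserving(threads_per_core: Sequence[int]) -> bool:
    """Return True if no core is idle while another core is overloaded.

    threads_per_core[i] is the number of threads (running plus waiting)
    assigned to core i in the snapshot being checked.
    """
    any_idle = any(is_idle(n) for n in threads_per_core)
    any_overloaded = any(is_overloaded(n) for n in threads_per_core)
    return not (any_idle and any_overloaded)
```

For example, `is_work_conserving([0, 3, 1])` is false because core 0 is idle while core 1 has waiting threads, whereas `is_work_conserving([0, 1, 0])` is true because no core is overloaded.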
Index Terms
- Provable multicore schedulers with Ipanema: application to work conservation