DOI: 10.1145/3342195.3387544
research-article

Provable multicore schedulers with Ipanema: application to work conservation

Published: 17 April 2020

ABSTRACT

Recent research and bug reports have shown that work conservation, the property that a core is idle only if no other core is overloaded, is not guaranteed by Linux's CFS or FreeBSD's ULE multicore schedulers. Indeed, multicore schedulers are challenging to specify and verify: they must operate under stringent performance requirements, while handling very large numbers of concurrent operations on threads. As a consequence, the verification of correctness properties of schedulers has not yet been considered.

In this paper, we propose an approach, based on a domain-specific language and theorem provers, for developing schedulers with provable properties. We introduce the notion of concurrent work conservation (CWC), a relaxed definition of work conservation that can be achieved in a concurrent system where threads can be created, unblocked and blocked concurrently with other scheduling events. We implement several scheduling policies, inspired by CFS and ULE. We show that our schedulers obtain the same level of performance as production schedulers, while concurrent work conservation is satisfied.
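The work-conservation property stated in the abstract ("a core is idle only if no other core is overloaded") can be written as a predicate over a snapshot of per-core load. The following is a minimal sketch, not the paper's formalization: the thresholds (idle means zero runnable threads, overloaded means two or more) and all function names are illustrative assumptions, and the code is plain Python rather than the Ipanema DSL.

```python
from typing import Sequence


def is_idle(load: int) -> bool:
    # Assumption: a core is idle when it has no runnable thread at all.
    return load == 0


def is_overloaded(load: int) -> bool:
    # Assumption: a core is overloaded when it holds two or more runnable
    # threads (one can run, so at least one is waiting).
    return load >= 2


def is_work_conserving(loads: Sequence[int]) -> bool:
    """Check one snapshot: no core is idle while another is overloaded.

    `loads[i]` is the number of runnable threads on core i, including the
    one currently running (a hypothetical representation).
    """
    any_idle = any(is_idle(n) for n in loads)
    any_overloaded = any(is_overloaded(n) for n in loads)
    return not (any_idle and any_overloaded)


if __name__ == "__main__":
    print(is_work_conserving([1, 1, 1]))  # every core busy, none overloaded
    print(is_work_conserving([0, 2]))     # idle core next to an overloaded one
```

A check like this only describes a single frozen instant. As the abstract notes, threads can be created, blocked and unblocked concurrently with other scheduling events, which is why the paper works with the relaxed notion of concurrent work conservation rather than a per-snapshot condition.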

    Published in

      EuroSys '20: Proceedings of the Fifteenth European Conference on Computer Systems
      April 2020
      49 pages
      ISBN: 9781450368827
      DOI: 10.1145/3342195

      Copyright © 2020 ACM

      © 2020 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      EuroSys '20 Paper Acceptance Rate: 43 of 234 submissions, 18%
      Overall Acceptance Rate: 241 of 1,308 submissions, 18%