DOI: 10.1145/3009837.3009839

Mixed-size concurrency: ARM, POWER, C/C++11, and SC

Published: 1 January 2017

ABSTRACT

Previous work on the semantics of relaxed shared-memory concurrency has only considered the case in which each load reads the data of exactly one store. In practice, however, multiprocessors support mixed-size accesses, and these are used by systems software and (to some degree) exposed at the C/C++ language level. A semantic foundation for software, therefore, has to address them.
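To make the "exactly one store" assumption concrete, the following is a minimal single-threaded sketch, not taken from the paper, of a byte-granular memory in which one 4-byte load is satisfied by two separate 2-byte stores. The names `Store`, `apply_stores`, and `load` are hypothetical and chosen only for illustration; the sketch models no real architecture or relaxed behaviour, only the bookkeeping of which store supplied each byte.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Store:
    """A hypothetical write event: a label, a start address, and its bytes."""
    name: str
    addr: int
    data: bytes


def apply_stores(memory: bytearray, stores):
    """Apply stores in order and record, per byte, which store wrote it last."""
    origin = {}
    for s in stores:
        for i, b in enumerate(s.data):
            memory[s.addr + i] = b
            origin[s.addr + i] = s.name
    return origin


def load(memory: bytearray, origin: dict, addr: int, size: int):
    """Read `size` bytes (little-endian) and list the stores they came from."""
    value = int.from_bytes(memory[addr:addr + size], "little")
    sources = sorted({origin.get(a, "init") for a in range(addr, addr + size)})
    return value, sources


# Two 2-byte stores to adjacent halfwords, then one 4-byte load over both.
memory = bytearray(4)
stores = [
    Store("a", 0, (0x1111).to_bytes(2, "little")),
    Store("b", 2, (0x2222).to_bytes(2, "little")),
]
origin = apply_stores(memory, stores)
value, sources = load(memory, origin, 0, 4)
print(hex(value), sources)  # the load reads from both "a" and "b"
```

Here the single load has two distinct sources, so a model that relates each load to exactly one store cannot describe it directly; the relation has to be tracked at a finer granularity, such as per byte.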

We investigate the mixed-size behaviour of ARMv8 and IBM POWER architectures and implementations: by experiment, by developing semantic models, by testing the correspondence between these, and by discussion with ARM and IBM staff. This turns out to be surprisingly subtle, and on the way we have to revisit the fundamental concepts of coherence and sequential consistency, which change in this setting. In particular, we show that adding a memory barrier between each instruction does not restore sequential consistency. We go on to extend the C/C++11 model to support non-atomic mixed-size memory accesses.

This is a necessary step towards semantics for real-world shared-memory concurrent code, beyond litmus tests.


Published in

POPL '17: Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages
January 2017, 901 pages
ISBN: 9781450346603
DOI: 10.1145/3009837

Copyright © 2017 ACM

Publisher: Association for Computing Machinery, New York, NY, United States