ABSTRACT
The increasing performance of today's computer architectures comes with an unprecedented increase in hardware complexity. Unfortunately, this results in software that is difficult to tune and, consequently, in a gap between potential peak performance and actual performance. Automatic tuning is an emerging approach that assists the programmer in managing this complexity. State-of-the-art autotuners are limited, however: they either require long tuning times, e.g., due to iterative search, or cannot cope with the complexity of the problem because of limitations in the supervised machine learning (ML) methodologies they use. In particular, traditional ML autotuning approaches based on classification algorithms (such as neural networks and support vector machines) struggle to capture all features of large search spaces. We propose a new way of performing automatic tuning based on structural learning: the tuning problem is formulated as predicting a ranking of code versions and solved using ordinal regression. We demonstrate its potential on a well-known autotuning problem: stencil computations. We compare state-of-the-art iterative compilation methods with our ordinal regression approach and analyze the quality of the obtained rankings in terms of the Kendall rank correlation coefficient.
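To make the formulation concrete, here is a minimal sketch of ranking-based autotuning in the spirit of the abstract: code versions, described by feature vectors, are ordered by a pairwise ordinal regression model (the RankSVM reduction of Joachims, 2002, trained here with scikit-learn's LinearSVC), and the predicted ranking is scored with the Kendall rank correlation coefficient via scipy.stats.kendalltau. The feature vectors and run times below are synthetic placeholders, not the paper's stencil benchmarks; this illustrates the technique, not the authors' implementation.

```python
# Sketch of ranking-based autotuning with pairwise ordinal regression.
# Assumptions: synthetic features/run times stand in for real code
# versions; this is NOT the paper's actual model or benchmark data.
import numpy as np
from scipy.stats import kendalltau
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# Each row is a feature vector describing one code version
# (e.g., tile sizes, unroll factors); lower run time is better.
X = rng.uniform(size=(40, 4))
w_true = np.array([2.0, -1.0, 0.5, 3.0])              # hidden "true" cost model
runtime = X @ w_true + 0.05 * rng.standard_normal(40)

# RankSVM reduction: every ordered pair (i, j) with runtime[i] < runtime[j]
# becomes a difference vector labeled +1, and the reverse pair -1.
pairs, labels = [], []
for i in range(len(X)):
    for j in range(len(X)):
        if runtime[i] < runtime[j]:
            pairs.append(X[i] - X[j]); labels.append(+1)
            pairs.append(X[j] - X[i]); labels.append(-1)

# No intercept: difference vectors are symmetric around the origin.
clf = LinearSVC(C=1.0, fit_intercept=False).fit(np.array(pairs), np.array(labels))

# Rank unseen versions: a higher score w.x means predicted-faster.
X_test = rng.uniform(size=(15, 4))
true_time = X_test @ w_true
score = X_test @ clf.coef_.ravel()

# Kendall's tau compares the predicted ordering with the true run-time
# ordering; tau close to +1 means the model ranks versions correctly.
tau, _ = kendalltau(-score, true_time)
print(f"Kendall tau on held-out versions: {tau:.3f}")
```

In this reduction, learning to rank reduces to binary classification of pairwise feature differences, so any linear classifier can play the role of the ranking model; the resulting weight vector induces a total order over candidate code versions without requiring each one to be compiled and timed.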
REFERENCES
- Jason Ansel, Shoaib Kamil, Kalyan Veeramachaneni, Jonathan Ragan-Kelley, Jeffrey Bosboom, Una-May O'Reilly, and Saman Amarasinghe. 2014. OpenTuner: An Extensible Framework for Program Autotuning. In International Conference on Parallel Architectures and Compilation Techniques (PACT).
- Gökhan H. Bakir, Thomas Hofmann, Bernhard Schölkopf, Alexander J. Smola, Ben Taskar, and S. V. N. Vishwanathan. 2007. Predicting Structured Data (Neural Information Processing). The MIT Press.
- Biagio Cosenza, Juan J. Durillo, Stefano Ermon, and Ben Juurlink. 2017. Autotuning Stencil Computations with Structural Ordinal Regression Learning. In IEEE International Parallel and Distributed Processing Symposium (IPDPS).
- Matthias Christen, Olaf Schenk, and Helmar Burkhart. 2011. PATUS: A Code Generation and Autotuning Framework for Parallel Iterative Stencil Computations on Modern Microarchitectures. In IEEE International Parallel and Distributed Processing Symposium (IPDPS). 676--687.
- Thorsten Joachims. 2002. Optimizing Search Engines Using Clickthrough Data. In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD). 133--142.
- Thorsten Joachims. 2006. Training Linear SVMs in Linear Time. In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD). 217--226.
- Maurice Kendall. 1976. Rank Correlation Methods (4th ed.). Hodder Arnold.
- Klaus Kofler, Ivan Grasso, Biagio Cosenza, and Thomas Fahringer. 2013. An Automatic Input-Sensitive Approach for Heterogeneous Task Partitioning. In ACM International Conference on Supercomputing (ICS). 149--160.
- Hugh Leather, Edwin Bonilla, and Michael O'Boyle. 2009. Automatic Feature Generation for Machine Learning Based Optimizing Compilation. In International Symposium on Code Generation and Optimization (CGO). 81--91.
- Yulong Luo, Guangming Tan, Zeyao Mo, and Ninghui Sun. 2015. FAST: A Fast Stencil Autotuning Framework Based on an Optimal-Solution Space Model. In ACM International Conference on Supercomputing (ICS). 187--196.
- S. Muralidharan, M. Shantharam, M. Hall, M. Garland, and B. Catanzaro. 2014. Nitro: A Framework for Adaptive Code Variant Tuning. In IEEE International Parallel and Distributed Processing Symposium (IPDPS). 501--512.
- Mark Stephenson and Saman P. Amarasinghe. 2005. Predicting Unroll Factors Using Supervised Classification. In IEEE/ACM International Symposium on Code Generation and Optimization (CGO). 123--134.
- Kevin Stock, Louis-Noël Pouchet, and P. Sadayappan. 2012. Using Machine Learning to Improve Automatic Vectorization. ACM Trans. Archit. Code Optim. 8, 4, Article 50 (Jan. 2012), 23 pages.