Technical Report No. 06-18
Proceedings of the
Second International Workshop on
Library-Centric Software Design
(LCSD '06)
JOSHUA BLOCH
JAAKKO JÄRVI (PROGRAM CO-CHAIRS)
ANDREAS PRIESNITZ
SIBYLLE SCHUPP (PROCEEDINGS EDITORS)
Department of Computer Science and Engineering
Division of Computing Science
CHALMERS UNIVERSITY OF TECHNOLOGY/
GÖTEBORG UNIVERSITY
Göteborg, Sweden, 2006
Technical Report in Computer Science and Engineering at
Chalmers University of Technology and Göteborg University
Technical Report No. 06-18
ISSN: 1652-926X
Department of Computer Science and Engineering
Chalmers University of Technology and Göteborg University
SE-412 96 Göteborg, Sweden
Göteborg, Sweden, October 2006
Proceedings of the Second International Workshop on
Library-Centric Software Design
(LCSD ’06)
An OOPSLA Workshop
October 22, 2006
Portland, Oregon, USA
Joshua Bloch and Jaakko Järvi (Program Co-Chairs)
Andreas Priesnitz and Sibylle Schupp (Proceedings Editors)
Chalmers University of Technology
Computer Science and Engineering Department
Technical Report 06-18
Foreword
These proceedings contain the papers selected for presentation at the workshop Library-Centric Software
Design (LCSD), held on October 22nd, 2006 in Portland, Oregon, USA, as part of the yearly ACM
OOPSLA conference. The current workshop is the second LCSD workshop in the series. The first
LCSD workshop, in 2005, was a success, and we are very pleased to see that interest in the current
workshop was even higher.
Software libraries are central to all major scientific, engineering, and business areas, yet the design,
implementation, and use of libraries are underdeveloped arts. The goal of the Library-Centric Software
Design workshop therefore is to place the various aspects of libraries on a sound technical and scientific
footing. To that end, we welcome both research into fundamental issues and the documentation of best
practices. The idea for a workshop on Library-Centric Software Design was born at the Dagstuhl meeting
Software Libraries: Design and Evaluation in March 2005. Currently LCSD has a steering committee
developing the workshop further and coordinating the organization of future events. The committee
currently consists of Josh Bloch, Jaakko Järvi, Sibylle Schupp, Dave Musser, Alex Stepanov, and Frank
Tip. We aim to keep LCSD growing.
For the current workshop, we received 20 submissions, nine of which were accepted as technical
papers, and an additional four as position papers. The topics of the papers covered a wide area of the
field of software libraries, including library evolution; abstractions for generic manipulation of complex
mathematical structures; static analysis and type systems for software libraries; extensible languages;
and libraries with run-time code generation capabilities. All papers were reviewed for soundness and
relevance by three or more reviewers. The reviews were very thorough, for which we thank the members
of the program committee. In addition to paper presentations, workshop activities included a keynote by
Sean Parent, Adobe Inc. At the time of writing this foreword, we do not yet know the exact attendance
of the workshop; the registrations received suggest close to 50 attendees.
We thank all authors, reviewers, and the organizing committee for their work in bringing about the
LCSD workshop. We are very grateful to Sibylle Schupp, David Musser, and Jeremy Siek for their efforts
in organizing the event, as well as to DongInn Kim and Andrew Lumsdaine for hosting the CyberChair
system to manage the submissions. We also thank Tim Klinger and the OOPSLA workshop organizers
for the help we received.
We hope you enjoy the papers, and that they generate new ideas leading to advances in this exciting
field of research.
Jaakko Järvi
Joshua Bloch
(Program co-chairs)
Organization
Workshop Organizers
- Josh Bloch, Google Inc.
- Jaakko Järvi, Texas A&M University
- David Musser, Rensselaer Polytechnic Institute
- Sibylle Schupp, Chalmers University of Technology
- Jeremy Siek, Rice University
Program Committee
- Dave Abrahams, Boost Consulting
- Olav Beckmann, Imperial College London
- Hervé Brönnimann, Polytechnic University
- Cristina Gacek, University of Newcastle upon Tyne
- Douglas Gregor, Indiana University
- Paul Kelly, Imperial College London
- Doug Lea, State University of New York at Oswego
- Andrew Lumsdaine, Indiana University
- Erik Meijer, Microsoft Research
- Tim Peierls, Prior Artisans LLC
- Doug Schmidt, Vanderbilt University
- Anthony Simons, University of Sheffield
- Bjarne Stroustrup, Texas A&M University and AT&T Labs
- Todd Veldhuizen, University of Waterloo
Contents

Active Libraries
An Active Linear Algebra Library Using Delayed Evaluation and Runtime Code Generation
Francis P. Russell, Michael R. Mellor, Paul H. J. Kelly, and Olav Beckmann
Efficient Run-Time Dispatching in Generic Programming with Minimal Code Bloat
Lubomir Bourdev and Jaakko Järvi
Generic Library Extension in a Heterogeneous Environment
Cosmin Oancea and Stephen M. Watt
Adding Syntax and Static Analysis to Libraries via Extensible Compilers and Language Extensions
Eric Van Wyk, Derek Bodin, and Paul Huntington

Type Systems and Static Analysis
A Static Analysis for the Strong Exception-Safety Guarantee
Gustav Munkby and Sibylle Schupp
Extending Type Systems in a Library
Yuriy Solodkyy, Jaakko Järvi, and Esam Mlaih
Anti-Deprecation: Towards Complete Static Checking for API Evolution
S. Alexander Spoon

Libraries Manipulating Complex Structures
A Generic Lazy Evaluation Scheme for Exact Geometric Computations
Sylvain Pion and Andreas Fabri
A Generic Topology Library
René Heinzl, Michael Spevak, and Philipp Schwaha

Position Papers
A Generic Discretization Library
Michael Spevak, René Heinzl, and Philipp Schwaha
The SAGA C++ Reference Implementation
Hartmut Kaiser, Andre Merzky, Stephan Hirmer, and Gabrielle Allen
A Parameterized Iterator Request Framework for Generic Libraries
Jacob Smith, Jaakko Järvi, and Thomas Ioerger
Pound Bang What?
John P. Linderman

An Active Linear Algebra Library Using Delayed Evaluation
and Runtime Code Generation
[Extended Abstract]
Francis P Russell, Michael R Mellor, Paul H J Kelly and Olav Beckmann
Department of Computing
Imperial College London
180 Queen’s Gate, London SW7 2AZ, UK
ABSTRACT
Active libraries can be defined as libraries which play an active
part in the compilation (in particular, the optimisation)
of their client code. This paper explores the idea of delaying
evaluation of expressions built using library calls, then
generating code at runtime for the particular compositions
that occur. We explore this idea with a dense linear algebra
library for C++. The key optimisations in this context are
loop fusion and array contraction.
Our library automatically fuses loops, identifies unnecessary
intermediate temporaries, and contracts temporary arrays
to scalars. Performance is evaluated using a benchmark
suite of linear solvers from ITL (the Iterative Template
Library), and is compared with MTL (the Matrix Template
Library). Excluding runtime compilation overheads (caching
means they occur only on the first iteration), for larger
matrix sizes, performance matches or exceeds MTL, and in
some cases is more than 60% faster.
1. INTRODUCTION
The idea of an “active library” is that, just as the library
extends the language available to the programmer for problem
solving, so the library should also extend the compiler.
The term was coined by Czarnecki et al. [5], who observed
that active libraries break the abstractions common in
conventional compilers. Active libraries are described in detail
by Veldhuizen and Gannon [8].
This paper presents a prototype linear algebra library which
we have developed in order to explore one interesting approach
to building active libraries. The idea is to use a
combination of delayed evaluation and runtime code generation
to:
Delay library call execution: Calls made to the library
are used to build a “recipe” for the delayed computation.
When execution is finally forced by the need for
a result, the recipe will commonly represent a complex
composition of primitive calls.
Generate optimised code at runtime: Code is generated
at runtime to perform the operations present in the delayed
recipe. In order to obtain improved performance
over a conventional library, it is important that the
generated code should, on average, execute faster than
a statically generated counterpart in a conventional
library. To achieve this, we apply optimisations that
exploit the structure, semantics and context of each
library call.
This approach has the advantages that:
• There is no need to analyse the client source code.
• The library user is not tied to a particular compiler.
• The interface of the library is not overcomplicated by
the concerns of achieving high performance.
• We can perform optimisations across both statement
and procedure boundaries.
• The code generated for a recipe is isolated from client-side
code: it is not interwoven with non-library code.
This last point is particularly important, as we shall see:
because the structure of the code for a recipe is restricted in
form, we can introduce compilation passes specially targeted
to achieve particular effects.
The disadvantage of this approach is the overhead of runtime
compilation and the infrastructure to delay evaluation.
In order to minimise the first factor, we maintain a cache of
previously generated code along with the recipe used to generate
it. This enables us to reuse previously optimised and
compiled code when the same recipe is encountered again.
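The caching idea can be illustrated with a minimal sketch. It is not the paper's cache architecture: the names `RecipeCache` and `Kernel`, the use of a string as the recipe key, and the use of `std::function` in place of dynamically compiled code are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a compiled recipe: a callable that updates a vector in place.
using Kernel = std::function<void(std::vector<double>&)>;

// Hypothetical cache keyed by a textual description of the recipe.
class RecipeCache {
public:
    // Returns the kernel for `recipe_key`; `compile` runs only on a cache miss.
    template <typename Compile>
    const Kernel& lookup(const std::string& recipe_key, Compile compile) {
        assert(!recipe_key.empty());
        auto it = kernels_.find(recipe_key);
        if (it == kernels_.end()) {
            ++misses_;
            it = kernels_.emplace(recipe_key, compile()).first;
        } else {
            ++hits_;
        }
        return it->second;
    }

    std::size_t hits() const { return hits_; }
    std::size_t misses() const { return misses_; }

private:
    std::unordered_map<std::string, Kernel> kernels_;
    std::size_t hits_ = 0;
    std::size_t misses_ = 0;
};
```

In this sketch, looking up the same key twice invokes the (expensive) compile step once, which mirrors the claim that compilation cost is paid only the first time a recipe is seen.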
There are also more subtle disadvantages. In contrast to
a compile-time solution, we are forced to make online decisions
about what to evaluate, and when. Living without
static analysis of the client code means we don’t know, for
example, which variables involved in a recipe are actually
live when the recipe is forced. We return to these issues
later in the paper.
Our exploration covers the following ground:
1. We present an implementation of a C++ library for
dense linear algebra which provides functionality sufficient
to operate with the majority of methods available
in the Iterative Template Library [6] (ITL), a set
of templated linear iterative solvers for C++.
2. This implementation delays execution, generates code
for delayed recipes at runtime, and then invokes a vendor
C compiler at runtime, entirely transparently to
the library user.
3. To avoid repeated compilation of recurring recipes, we
cache compiled code fragments (see Section 4).
4. We implemented two optimisation passes which transform
the code prior to compilation: loop fusion, and
array contraction (see Section 5).
5. We introduce a scheme to predict, statistically, which
intermediate variables are likely to be used after recipe
execution; this is used to increase opportunities for
array contraction (see Section 6).
6. We evaluate the effectiveness of the approach using a
suite of iterative linear system solvers, taken from the
Iterative Template Library (see Section 7).
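The two optimisations named in item 4 can be illustrated by hand. The sketch below is not output from the library; the function names are hypothetical, and it simply contrasts a computation of (x + y) · z written as two loops with a temporary array against the same computation after loop fusion, where the temporary array contracts to a scalar.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Unfused form: one loop builds a full-size temporary, a second loop consumes it.
double dot_of_sum_unfused(const std::vector<double>& x,
                          const std::vector<double>& y,
                          const std::vector<double>& z) {
    assert(x.size() == y.size() && y.size() == z.size());
    std::vector<double> t(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        t[i] = x[i] + y[i];
    }
    double r = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        r += t[i] * z[i];
    }
    return r;
}

// Fused and contracted form: a single loop, with the temporary array
// reduced to a scalar that lives only for one iteration.
double dot_of_sum_fused(const std::vector<double>& x,
                        const std::vector<double>& y,
                        const std::vector<double>& z) {
    assert(x.size() == y.size() && y.size() == z.size());
    double r = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double t = x[i] + y[i];
        r += t * z[i];
    }
    return r;
}
```

Both functions compute the same value; the fused version makes one pass over the data and allocates no intermediate array.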
Although the exploration of these techniques has used only
dense linear algebra, we believe these techniques are more
widely applicable. Dense linear algebra provides a simple
domain in which to investigate, understand and demonstrate
these ideas. Other domains we believe may benefit
from these techniques include sparse linear algebra and image
processing operations.
The contributions we make with this work are as follows:
• Compared to the widely used Matrix Template Library [7],
we demonstrate performance improvements
of up to 64% across our benchmark suite of dense linear
iterative solvers from the Iterative Template Library.
Performance depends on platform, but on a 3.2GHz
Pentium 4 (with 2MB cache) using the Intel C Compiler,
average improvement across the suite was 27%,
once cached compiled code was available.
• We present a cache architecture that finds applicable
pre-compiled code quickly, and which supports annotations
for adaptive re-optimisation.
• Using our experience with this library, we discuss some
of the design issues involved in using the delayed-evaluation,
runtime code generation technique.
We discuss related work in Section 8.
Figure 1: An example DAG. The rectangular node
denotes a handle held by the library client. The
expression represents the matrix-vector multiply
function from Level 2 BLAS, y = αAx + βy.
2. DELAYING EVALUATION
Delayed evaluation provides the mechanism whereby we collect
the sequences of operations we wish to optimise. We call
the runtime information we obtain about these operations
runtime context information.
This information may consist of values such as matrix or
vector sizes, or the various relationships between successive
library calls. Knowledge of dynamic values such as matrix
and vector sizes allows us to improve the performance of
the implementation of operations using these objects. For
example, the runtime code generation system (see Section 3) can
use this information to specialise the generated code. One
specialisation we perform is on loop bounds: we incorporate
dynamically known sizes of vectors and matrices as constants
in the runtime generated code.
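As an illustration of loop-bound specialisation, the sketch below emits C source text in which the vector length appears as a literal constant rather than a runtime parameter. The text does not show the generated code, so the emitter `emit_axpy_source`, the generated function name `axpy_N`, and the loop body are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical emitter: produces C source for y[i] = alpha * x[i] + y[i]
// with the dynamically known vector length n baked in as a constant,
// so the C compiler sees a fixed trip count.
std::string emit_axpy_source(std::size_t n) {
    assert(n > 0);
    std::ostringstream src;
    src << "void axpy_" << n << "(double alpha, const double* x, double* y) {\n"
        << "  for (unsigned long i = 0; i < " << n << "; ++i) {\n"
        << "    y[i] = alpha * x[i] + y[i];\n"
        << "  }\n"
        << "}\n";
    return src.str();
}
```

A string produced this way would then be handed to a C compiler at runtime; that step is omitted here because it depends on the platform and compiler in use.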
Delayed evaluation in the library we developed works as follows:
• Delayed expressions built using library calls are represented
as Directed Acyclic Graphs (DAGs).
• Nodes in the DAG represent either data values (literals)
or operations to be performed on them.
• Arcs in the DAG point to the values required before a
node can be evaluated.
• Handles held by the library client may also hold references
to nodes in the expression DAG.
• Evaluation of the DAG involves replacing non-literal
nodes with literals.
• When a node no longer has any nodes or handles depending
on it, it deletes itself.
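The properties listed above can be sketched in a few lines of C++. This is a simplified illustration, not the library's implementation: the names `Node`, `Handle`, `literal`, `add`, `scale` and `force` are hypothetical, reference counting via `std::shared_ptr` is assumed as the mechanism by which unreferenced nodes delete themselves, and forcing simply interprets the DAG instead of generating code.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

using Vec = std::vector<double>;

// A DAG node is either a literal (holds data) or an operation on other nodes.
struct Node {
    enum class Kind { Literal, Add, Scale };
    Kind kind = Kind::Literal;
    Vec value;                       // meaningful only for literals
    double factor = 1.0;             // used by Scale
    std::shared_ptr<Node> lhs, rhs;  // arcs to the values this node needs
};

// Client-side handle: shares ownership of a node in the DAG.
using Handle = std::shared_ptr<Node>;

Handle literal(Vec v) {
    auto n = std::make_shared<Node>();
    n->value = std::move(v);
    return n;
}

Handle add(Handle a, Handle b) {
    auto n = std::make_shared<Node>();
    n->kind = Node::Kind::Add;
    n->lhs = std::move(a);
    n->rhs = std::move(b);
    return n;
}

Handle scale(double f, Handle a) {
    auto n = std::make_shared<Node>();
    n->kind = Node::Kind::Scale;
    n->factor = f;
    n->lhs = std::move(a);
    return n;
}

// Forces evaluation: an operation node is replaced by a literal, and its
// arcs are dropped so operands with no other dependents are destroyed.
const Vec& force(const Handle& h) {
    Node& n = *h;
    if (n.kind == Node::Kind::Literal) return n.value;
    const Vec& a = force(n.lhs);
    Vec out(a.size());
    if (n.kind == Node::Kind::Add) {
        const Vec& b = force(n.rhs);
        assert(a.size() == b.size());
        for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] + b[i];
    } else {
        for (std::size_t i = 0; i < a.size(); ++i) out[i] = n.factor * a[i];
    }
    n.value = std::move(out);
    n.kind = Node::Kind::Literal;
    n.lhs.reset();
    n.rhs.reset();
    return n.value;
}
```

Building `add(scale(2.0, x), y)` performs no arithmetic; work happens only when `force` is called on the resulting handle, after which the intermediate `scale` node has no dependents and is released.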
[...]

... with the result that the matrix involved was only iterated over once for both operations. A graph of the speedup obtained across matrix sizes is shown in Figure 2. The second optimisation implemented was array contraction. We only evaluated this in the presence of loop fusion, as the former is often facilitated by the latter. The array contraction pass did not show any noticeable improvement on any of the ...

... accessed via the getErasedSTL function in the form of an unsigned long value. The implementation of the erase function retrieves the STL objects corresponding to the GIDL wrapper parameters, calls the STL erase function on the STL vector reference, and creates a new GIDL server corresponding to the iterator result. Note that the semantics of the erase function are irrelevant in what the translation mechanism ...

... depend on the types of the elements contained in these containers, a high-quality implementation is expected to hoist this functionality to non-generic functions. The GNU Standard C++ Library v3 does exactly this: the tree balancing functions operate on pointers to a non-generic base class of the tree’s node type. In the case of associative containers, the tree node type is split into a generic and non-generic ...

... extension of multiple, independent dimensions of the library’s behavior. In this situation, there are questions of how the extended library’s hierarchy relates to the original library’s hierarchy, how objects from independent extensions may be used and how the extensions interact. This paper examines the question of library extension in a heterogeneous environment. We consider the situation where software ...

... properties of the extension:
• The extension interface should be type-precise and it should allow type-safety reasoning with respect to the extension itself. The type-safety result for the whole framework would thus be derived from the ones of the extensions and of the underlying architecture.
• The extension should be split in first-class value components. In the GIDL case for example, one component should ...

... implementation retrieves the parameters’ UA-objects, invokes the UA method on these, and performs the reverse operation on the result. The wrapper skeleton functionality is the inverse of the client. The wrapper skeleton method creates GIDL stub wrapper objects encapsulating the UA objects, thus recovering the generic type erased information. It then invokes the user-implemented server method with these parameters, ...

... for the application of a convolution filter to an image. As the size and the values of the convolution matrix are known at the runtime code generation stage, the two inner loops of the convolution can be unrolled and specialised with the values of the matrix elements. Another example shows how a runtime search can be performed to find an optimal tile size for a matrix multiply. TaskGraph is also used as the ...

... the STL orthogonal design of its domains. For example, GIDL iterators are themselves valid STL iterators and thus they can be manipulated by the STL containers and algorithms. In this context we investigate the issues that prevent the translation to conform with the library semantics, the techniques to amend them, and the tradeoffs between translation ease-of-use and performance. The second objective was ...

... comparison of the BiConjugate Gradient solver against MTL running on architecture 2 is shown in Figure 4. In the figures just quoted, we excluded the runtime compilation overhead, leaving just the performance increase in the numerical operations. As the iterative solvers use code caching, the runtime compilation overhead is independent of the number of iterations executed. Depending on the number of iterations ...

... retrieves the UA IDL-object or value of the result and passes it to the IDL skeleton. The extension introduces an extra level of indirection with respect to the method invocation mechanism of the underlying framework. This is the price to pay for the generality of the approach: this generic extension will work on top of any UA vendor implementation while maintaining backward compatibility. However, since the ...