(0) items

Note: Supplemental materials are not guaranteed with Rental or Used book purchases.
Advanced Backend Code Optimization,9781848215382
This item qualifies for

Your order must be $59 or more, you must select US Postal Service Shipping as your shipping preference, and the "Group my items into as few shipments as possible" option when you place your order.

Bulk sales, PO's, Marketplace Items, eBooks, Apparel, and DVDs not included.

Advanced Backend Code Optimization



by ;
Pub. Date:
Iste/Hermes Science Pub
List Price: $154.00

Rent Textbook


Buy New Textbook

Usually Ships in 3-4 Business Days

Used Textbook

We're Sorry
Sold Out


We're Sorry
Not Available

More New and Used
from Private Sellers
Starting at $138.46

Questions About This Book?

Why should I rent this book?

Renting is easy, fast, and cheap! Renting from can save you hundreds of dollars compared to the cost of new or used books each semester. At the end of the semester, simply ship the book back to us with a free UPS shipping label! No need to worry about selling it back.

How do rental returns work?

Returning books is as easy as possible. As your rental due date approaches, we will email you several courtesy reminders. When you are ready to return, you can print a free UPS shipping label from our website at any time. Then, just return the book to your UPS driver or any staffed UPS location. You can even use the same box we shipped it in!

What version or edition is this?

This is the edition with a publication date of 6/23/2014.

What is included with this book?

  • The New copy of this book will include any supplemental materials advertised. Please check the title of the book to determine if it should include any CDs, lab manuals, study guides, etc.
  • The Rental copy of this book is not guaranteed to include any supplemental materials. You may receive a brand new copy, but typically, only the book itself.


This book is a summary of more than a decade of research in the area of backend optimization. It contains the latest fundamental research results in this field. While existing books are often more oriented toward Masters students, this book is aimed more towards professors and researchers as it contains more advanced subjects.
It is unique in the sense that it contains information that has not previously been covered by other books in the field, with chapters on phase ordering in optimizing compilation; register saturation in instruction level parallelism; code size reduction for software pipelining; memory hierarchy effects and instruction level parallelism.
Other chapters provide the latest research results in well-known topics such as register need, and software pipelining and periodic register allocation.

Table of Contents

1 Prologue

1.1 Introduction

2 Phase Ordering in Optimising Compilation

2.1 Introduction to the Phase Ordering Problem

2.2 Background on Phase Ordering

2.2.1 Performance Modelling and Prediction

2.2.2 Some Attempts in Phase Ordering

2.3 Towards a Theoretical Model for Phase Ordering Problem

2.3.1 Decidability Results

2.3.2 Another Formulation of the Phase Ordering Problem

2.4 Examples of Decidable Simplified Cases

2.4.1 Models with Compilation Costs

2.4.2 One-pass Generative Compilers

2.5 Compiler Optimisation Parameters Space Exploration

2.5.1 Towards a Theoretical model

2.5.2 Examples of Simplified Decidable Cases

2.6 Conclusion on Phase Ordering

3 The Register Need

3.1 Data Dependence Graph and Processor Models

3.2 The Acyclic Register Need

3.3 The Periodic Register Need

3.3.1 Software Pipelining, Periodic Scheduling, Cyclic Scheduling

3.3.2 The Circular Lifetime Intervals

3.4 Computing the Periodic Register Need

3.5 Some Results on the Periodic Register Need

3.5.1 Minimal Periodic Register Need vs. Initiation Interval

3.5.2 Computing the Periodic Register Sufficiency

3.5.3 Stage Scheduling under Register Constraints

3.6 Conclusion on the Register Requirement

4 The Register Saturation

4.1 Motivations on the Register Saturation Concept

4.2 Computing the Acyclic Register Saturation

4.2.1 Characterising the Register Saturation

4.2.2 Efficient Algorithmic Heuristic for RS Computation

4.2.3 Experimental Efficiency of Greedy-k

4.3 Computing the Periodic Register Saturation

4.4 Conclusion on the Register Saturation

5 Spill Code Reduction

5.1 Introduction on Register Constraints in Software Pipelining

5.2 Related Work in Periodic Register Allocation

5.3 SIRA: Schedule Independant Register Allocation

5.3.1 Reuse Graphs

5.3.2 DDG Associated to Reuse Graph

5.3.3 Exact SIRA with Integer Linear Programming

5.3.4 SIRA with Fixed Reuse Edges

5.4 SIRALINA: An Efficient Polynomial Heuristic for SIRA

5.5 Experimental Results with SIRA

5.6 Conclusion on Spill Code Reduction

6 Exploiting the Register Access Delays

6.1 Problem Description of DDG Cycles with Non-positive Distances .

6.2 Eliminating Non-Positive Cycles

6.3 Experimental Results on Eliminating Non-Positive Cycles

6.4 Conclusion on Non-Positive Cycles Elimination

7 Loop Unrolling Degree Minimisation

7.1 Introduction

7.2 Unroll Degree Minimisation of Unscheduled Loops

7.2.1 Problem Description of Unroll Factor Minimisation for Unscheduled Loops

7.2.2 Algorithmic Solution for Unroll Factor Minimisation: Single Register Type

7.2.3 Solution for LCM Problem

7.2.4 Unroll Factor Minimisation in Presence of Multiple Register Types

7.2.5 Solution for Minimal Loop Unrolling

7.3 Unroll Degree Minimisation of Scheduled Loops

7.4 Experimental Results

7.5 Conclusion on Loop Unroll Degree Minimisation

8 Memory Hierarchy E↵ects and ILP

8.1 Problem of Memory Disambiguation at Runtime

8.1.1 Introduction

8.1.2 Related Work

8.1.3 Experimental Environment

8.1.4 Experimentation Methodology

8.1.5 Experimental Study of Cache Behavior

8.1.6 The E↵ectiveness of Load/Store Vectorisation

8.1.7 Conclusion on Memory Disambiguation Mechanisms

8.2 Data Preloading and Prefetching

8.2.1 Introduction

8.2.2 Related Work

8.2.3 Problems of Optimising Cache E↵ects at the Instruction Level

8.2.4 Target Processor Description

8.2.5 Our Methodology of Instruction-Level Code Optimisation

8.2.6 Experimental Results

8.2.7 Conclusion on Pre-fetching and Pre-Loading

9 Statistical Performance Analysis

9.1 Code Performance Variation

9.2 The Speedup-Test Protocole

9.2.1 The Observed Speedups

9.2.2 The Speedup of the Observed Average Execution Time

9.2.3 The Speedup of the Observed Median Execution Time, as well as Individual Runs

9.3 Discussion and Conclusion on the Speedup-Test

10 Epilogue

10.1 Problem of Instruction Selection

10.2 Perspectives on Code Optimisation for Multi-Core Processors

10.3 General Conclusion

A Benchmarks Presentation 125

A.1 Qualitative Benchmarks Presentation

A.2 Quantitative Benchmarks Presentation

A.3 Changing the Architectural Configuration of the Processor

B Experiments on Register Saturation

B.1 The Acyclic Register Saturation

B.1.1 On the Oprimal RS Computation

B.1.2 On the Accuracy of Greedy-k Heuristic vs. Optimal RS

B.1.3 Greedy-k Execution Times

B.2 The Periodic Register Saturation

B.2.1 Optimal PRS Computation

B.2.2 Approximate PRS Computation with Heuristic

C Experiments on SIRA

C.1 Efficiency of SIRALINA on Standalone DDG

C.1.1 Naming conventions for register optimisation orders

C.1.2 Experimental efficiency of SIRALINA

C.1.3 Measuring the Increase of the MII

C.1.4 Efficiency of SIRALINA Execution Times

C.2 Efficiency of SIRALINA plugged Inside Industrial Compiler

C.2.1 Static Performance Results

C.2.2 Execution Time Performance Results

D Experiments on Non-Positive Cycles Elimination 151

D.1 Experimental Setup

D.1.1 Heuristics Nomenclature

D.1.2 Empirical Efficiency Measures

D.2 Comparison of the Heuristics Execution Times

D.2.1 Time to Minimise Register Pressure for a fixed II

D.3 Convergence of the Proactive Heuristic (Iterative SIRALINA)

D.4 Qualitative Analysis of the Heuristics

D.4.1 Number of Saved Registers

D.4.2 Proportion of Success

D.4.3 Increase of the MII when Success

D.5 Conclusion on Non-Positive Cycles Elimination Strategy

E Experiments on Unroll Degree Minimisation

E.1 Standalone Experiments with Single Register types

E.1.1 Experiments with Unscheduled Loops

E.1.2 Results on Randomly Generated DDG

E.1.3 Experiments on Real DDG

E.1.4 Experiments with Scheduled Loops

E.2 Experiments with Multiple Register Types

F Experiments on Preloading and Prefetching

Please wait while the item is added to your cart...