(1)

Code generation under Control

Rencontres sur la compilation / Saint Hippolyte

Henri-Pierre Charles CEA Laboratoire LaSTRE / Grenoble

12 December 2011

(2)

Introduction

Presentation

Henri-Pierre Charles, two-line CV:

2010-: CEA/DRT/DACLE/LIST/LaSTRE, CRI PILSI context at Gières

1993-2010: assistant professor at Université de Versailles Saint-Quentin-en-Yvelines, PRiSM laboratory, IUT de Vélizy

Keywords:

Architecture, HPC, Compiler backend, Parallelism (ILP, Multimedia, Caches)

6809, 68000, i860, TriMedia, Itanium, Power, CELL, ARM, MEPHISTO, other

GCC, LLVM, FFTW, H264, Spiral, ATLAS, MESA3D, other

3D image reconstruction, Z-buffer, video compression, FFTW, QCD

(3)

CEA / CRI PILSI

CEA: Commissariat à l'Énergie Atomique et aux Énergies Alternatives

DAM: Direction des Applications Militaires

DEN: Direction de l'Énergie Nucléaire

DRT: Direction de la Recherche Technologique

DSM: Direction des Sciences de la Matière

DSV: Direction des Sciences du Vivant

LIST: Laboratoire Intégration des Systèmes et des Technologies (Saclay)

LETI: Laboratoire Électronique et de Technologie de l'Information (Grenoble)

LITEN: Laboratoire Innovation pour les Technologies des Energies Nouvelles et les nanomatériaux

LaSTRE: Laboratoire Système Temps Réel (Saclay / Gières)

LIALP: Laboratoire Infrastructure et Atelier Logiciel pour Puces

(4)

LaSTRE presentation

Laboratoire Systèmes Temps Réel (real-time systems laboratory). Head: Vincent DAVID

OASIS: multi-scale time-triggered architecture (the system is measured at its own rhythm); temporal consistency of exchanged data

PharOS: same concepts, specialized for the automotive context: embedded systems, multiprocessors

MPPA: high-productivity parallel programming model for embedded HPC (MPPA project)

Low-level code optimization: dynamic code generation, low-level optimization, multimedia applications

(5)

Motivation Context

Objective?

Get home as fast as possible, safely

Speed-limit constraints

“Real” speed-limit constraints

Fuel-consumption constraints

(6)

Classical Compilation Chain

[Figure: classical compilation chain. Artifacts: source code, intermediate code, assembly code, binary code, runnable code, data. Actors: programmer (idea, algorithm), compiler, assembler, loader, system, user.]

Compilation objectives

Translate source code into a semantically equivalent binary

Assume “successive refinement”

Optimize for efficiency / parallelism: reduce cycle count

A performance defect is now a “bug” (not only in real-time systems)

“Performance counter in the loop”

(7)
(8)

Ask the program!

What makes the speed of this program vary?

int i;

for (i = 0; i < N; ++i) {
    int j;
    dest[i] = 0;
    for (j = 0; j < N; ++j)
        dest[i] += src[j] * m[i][j];
}

Compiler, data size, target processor, instruction set, available parallelism, data type, memory location, operating system, ...
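Several of these factors become visible as soon as the fragment is turned into a complete function. The sketch below is one possible reading of the slide's kernel, not the author's code: the name `matvec`, the `float` element type and the flat row-major layout are all assumptions, chosen only to make "data type" and "memory location" explicit parameters.

```c
#include <assert.h>
#include <stddef.h>

/* dest = m * src for an n x n matrix stored row-major in a flat array.
 * The float type and the flat layout are assumptions; the slide leaves
 * both open, and changing either one changes the generated code. */
void matvec(size_t n, const float *m, const float *src, float *dest)
{
    assert(m != NULL && src != NULL && dest != NULL);
    for (size_t i = 0; i < n; ++i) {
        float acc = 0.0f;            /* accumulate locally, store once */
        for (size_t j = 0; j < n; ++j)
            acc += src[j] * m[i * n + j];
        dest[i] = acc;
    }
}
```

Here `n` is a run-time argument, so a static compiler has to emit code that is valid for every size.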

(9)

Data Size Matters

Loop size (value of N):

10^1: multimedia kernel: full loop unrolling, instruction scheduling, memory cache accesses, ...

10^2 / 10^3: scientific code: loop unrolling, loop conversion, data prefetching

10^6: multimedia stream: multithreading

10^10 and more: high-level parallelism: MPI / Grid / Cloud, ...

N is generally a parameter known only at run time. Profiling and iterative compilation do not help.

Compilation strategies are complex and application-domain specific.
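A minimal sketch of what "choosing the strategy once N is known" can look like in plain C, without any code generation: two precompiled variants and a selector called at run time. The names `dot_generic`, `dot_unrolled4` and `select_dot`, and the threshold of exactly 4 elements, are illustrative assumptions and do not come from the talk.

```c
#include <assert.h>
#include <stddef.h>

typedef float (*dot_fn)(const float *a, const float *b, size_t n);

/* Generic loop: valid for any n. */
static float dot_generic(const float *a, const float *b, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; ++i)
        acc += a[i] * b[i];
    return acc;
}

/* Fully unrolled variant: valid only when n == 4, so n is ignored. */
static float dot_unrolled4(const float *a, const float *b, size_t n)
{
    (void)n;
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
}

/* Pick a variant once the size is known, i.e. at run time. */
dot_fn select_dot(size_t n)
{
    assert(n > 0);
    return (n == 4) ? dot_unrolled4 : dot_generic;
}
```

A static dispatcher like this only covers the sizes someone thought of in advance, which is the limitation that run-time code generation is meant to remove.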

(10)

Architecture

Architecture GENEPY

(11)

Mephisto operator

(12)
(13)

Dynamic compilation

Compilette at work

[Figure: the compilation chain (programmer with idea and algorithm, compiler, assembler, loader, system, user; source, intermediate, assembly, binary and runnable code; data).]

Compilette

Algorithmic optimizer Parameter Code generation

Data driven (size, alignment, values)

Energy driven (ISA selection, vectorization)

Speed driven (ISA selection, vectorization quality)

Network topology driven
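The "data driven (values)" case can be illustrated with a portable stand-in for a compilette. The sketch below does not emit machine code: it inspects one matrix row at run time and builds a reduced list of terms, which plays the role of the specialized code. The `row_plan` structure, `plan_build`, `plan_run` and the `MAX_TERMS` limit are hypothetical names introduced for this example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TERMS 16   /* arbitrary capacity for this sketch */

/* Stand-in for generated code: only the non-zero terms of one row. */
typedef struct {
    size_t count;
    size_t index[MAX_TERMS];
    float  coeff[MAX_TERMS];
} row_plan;

/* "Generation" step: look at the data once and drop zero coefficients.
 * Returns 0 on success, -1 if the row has more than MAX_TERMS non-zeros. */
int plan_build(row_plan *p, const float *row, size_t n)
{
    assert(p != NULL && row != NULL);
    p->count = 0;
    for (size_t j = 0; j < n; ++j) {
        if (row[j] == 0.0f)
            continue;                 /* this term would contribute nothing */
        if (p->count == MAX_TERMS)
            return -1;
        p->index[p->count] = j;
        p->coeff[p->count] = row[j];
        p->count++;
    }
    return 0;
}

/* "Execution" step: evaluate the specialized row against a source vector. */
float plan_run(const row_plan *p, const float *src)
{
    float acc = 0.0f;
    for (size_t k = 0; k < p->count; ++k)
        acc += p->coeff[k] * src[p->index[k]];
    return acc;
}
```

The cost of `plan_build` is paid once per data set, and `plan_run` then does work proportional to the number of useful terms. A real compilette would make the same decision but write instructions instead of table entries.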

(14)

deGoal: a tool for dynamic code generation

deGoal: a tool for compilette generation

Generate a code generator

Virtual portable instruction set (register-based data type)

Optimization at compile time & run time

Faster than any compiler code generator

No intermediate representation

Algorithmic level

Bottom-up approach

Targets: ARM, GENEPY, XP70V3/4, GPU, K1, ...

Memory footprint: a few KB
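To give a feel for "no intermediate representation" and a footprint of a few KB, here is a minimal sketch of a generator that writes ARM (A32) instruction words straight into a buffer, with a run-time value baked in as an immediate. It is not deGoal syntax or its API; `arm_add_imm`, `gen_add_const` and `ARM_BX_LR` are names invented for this example. Only the two encodings used are standard A32 encodings. Running the buffer would additionally need executable memory and an instruction-cache flush, which are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A32 encoding of "ADD rd, rn, #imm8" (condition AL, no rotation). */
static uint32_t arm_add_imm(unsigned rd, unsigned rn, uint8_t imm8)
{
    assert(rd < 16 && rn < 16);
    return 0xE2800000u | ((uint32_t)rn << 16) | ((uint32_t)rd << 12) | imm8;
}

#define ARM_BX_LR 0xE12FFF1Eu   /* A32 encoding of "BX lr" (return) */

/* Emit code for a function computing x + k, with x arriving in r0.
 * k is known only at run time and becomes an immediate operand.
 * Returns the number of words written, or 0 if the buffer is too small. */
size_t gen_add_const(uint32_t *buf, size_t cap, uint8_t k)
{
    size_t n = 0;
    if (buf == NULL || cap < 2)
        return 0;
    if (k != 0)                       /* value-driven: adding 0 emits nothing */
        buf[n++] = arm_add_imm(0, 0, k);
    buf[n++] = ARM_BX_LR;
    return n;
}
```

The generator itself is a handful of shifts and stores, which is why this style can be much faster and smaller than a JIT that first builds and optimizes an intermediate representation.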

(15)

FP7 H4H

FP7 H4H: High Performance for Heterogeneous Architecture

GPU JIT for Scilab

Generate NVIDIA assembly language (PTX) dynamically

Embed code generator in Scilab

Optimized data movement

Linear algebra context

(16)

FP7 Touchmore

FP7 Touchmore: dynamic code generation

Dynamic code generation for MPSoC

GENEPY tile (DSP Mephisto + MIPS)

Generate code for MIPS or Mephisto

Multimedia applications (MP3 / MP4)

(17)

Smecy

FP7 Smecy

Target: P2012 MPSoC / XP70 processor

Matrix x matrix dynamic generation

“Perfect hash” dynamic generator
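The slide gives no detail on the “perfect hash” generator. As a rough illustration of the idea only, the sketch below searches at run time for a table size that makes `key % m` collision-free for a given key set; the function name `find_perfect_mod`, the modulo scheme and the `MAX_SLOTS` bound are assumptions, not the Smecy implementation. A dynamic generator would then emit lookup code with the chosen constant baked in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SLOTS 64u   /* arbitrary bound so occupancy fits in 64 bits */

/* Return the smallest table size m in [n, MAX_SLOTS] for which key % m
 * maps every key to a distinct slot, or 0 if no such m exists. */
unsigned find_perfect_mod(const uint32_t *keys, size_t n)
{
    assert(keys != NULL || n == 0);
    if (n > MAX_SLOTS)
        return 0;
    for (unsigned m = (n ? (unsigned)n : 1u); m <= MAX_SLOTS; ++m) {
        uint64_t used = 0;            /* bitmask of occupied slots */
        size_t i;
        for (i = 0; i < n; ++i) {
            uint64_t bit = (uint64_t)1 << (keys[i] % m);
            if (used & bit)
                break;                /* collision: try the next m */
            used |= bit;
        }
        if (i == n)
            return m;
    }
    return 0;
}
```

For the keys 3, 7, 11 and 19, size 4 fails because 3 and 7 both map to slot 3, while size 5 maps them to slots 3, 2, 1 and 4.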

(18)

Related work

JIT compilation (Java, LLVM, CUDA): intermediate representation, heavyweight code generators (code footprint & time)

Python, Perl, PHP: too high level, glue languages

FFTW, Spiral: code generator, dynamic configuration

ATLAS: compile-time tuning

(19)

Conclusion

Dynamic code generation is THE challenge (JIT, JavaScript, emulation, multicore simulation, ...)

A lot of work remains: power characterization

MPSoC and HPC systems share some problems: multiple cores, power consumption control, ...

The parameters that control code generation are numerous and hard to manage
