
(1)

Linking Programming Models between Grids, Web 2.0 and Multicore

Distributed Programming Abstractions Workshop, NeSC, Edinburgh, UK
May 31 2007

Geoffrey Fox
Computer Science, Informatics, Physics
Pervasive Technology Laboratories, Indiana University, Bloomington IN 47401

(2)

Points in Talk I
• All parallel programming projects more or less fail; all distributed programming projects report success
  - There are several hundred in the Grid workflow area alone
• Few constraints on distributed programming
• Composition (in distributed computing) vs. decomposition (in parallel computing)
• There is not much difference between distributed programming and a key paradigm of parallel computing (functional parallelism)
• Pervasive use of 64-core chips in the future will often require one to build a Grid on a chip, i.e. to execute a traditional distributed application on a chip
• XML is a pretty dubious syntax for expressing programs
• Web 2.0 is pretty scruffy, but there are some large companies and many users behind it
• Web 2.0 and Grids will converge, and features of both will survive or disappear in the merged environment
• Web 2.0 has a more plausible approach to distributed programming than Web Services/Grids

(3)

Some More Points
• Services could be a universal abstraction in parallel and distributed computing
  - Whereas objects could not be universal, so perhaps we should move away from their use
• Gateways/Portals (Portlets, Widgets, Gadgets) are the natural user (application usage) interface to a collection of services
• Important data abstractions: SQL, WFS, RSS feeds
• Divide parallel programming run-times (matching application structure) into 3 or 4 broad classes
• Inter-entity communication time is characteristic of the different programming models: 1-5 µs for MPI/thread switching, 25 µs for services inside a chip, and 1-1000 milliseconds for services on the Grid
• Multicore commodity programming model:
  - A "Marine Corps" writes libraries in "HLA++", MPI or dynamic threads (internally one microsecond latency) expressed as services
  - Services are composed/mashed up by "millions" of users
• Many composition (coordination) or mashup approaches: functional (cf. Google MapReduce for data transformations), dataflow, workflow, visual, script
• The difficulty of making effective use of multicore chips will be so great that it will be the main driver of new programming environments
• Microsoft CCR/DSS is a good example of unification of parallel and distributed computing

(4)

Some Details
• See http://www.slideshare.net/Foxsden or, more conventionally:
• Web 2.0 and Grid tutorial:
  - http://grids.ucs.indiana.edu/ptliupages/presentations/CTSpartIMay21-07.ppt
  - http://grids.ucs.indiana.edu/ptliupages/presentations/Web20Tutorial_CTS.ppt
• Multicore and Parallel Computing tutorial:
  - http://grids.ucs.indiana.edu/ptliupages/presentations/PC2007/index.html
• "Web 2.0" citation site

(5)

Web 2.0 and Web Services I
• Web Services have clearly defined protocols (SOAP) and a well defined mechanism (WSDL) to define service interfaces
  - There is good .NET and Java support
  - The so-called WS-* ("WS-Nightmare") specifications provide a rich, sophisticated but complicated standard set of capabilities for security, fault tolerance, metadata, discovery, notification etc.
• "Narrow Grids" build on Web Services and provide a robust managed environment with growing adoption in Enterprise systems and distributed science (e-Science)
• We can use the term Grids strictly for Narrow Grids that are collections of Web Services (or even more strictly OGSA Grids), or, as is quite often done, call any collection of services a "Broad Grid"
• Web 2.0 supports a similar architecture to Web Services but has developed in a more chaotic yet remarkably successful fashion, with a service architecture embracing a variety of protocols including those of Web and Grid services
  - Over 400 interfaces defined at http://www.programmableweb.com/apis
• One can easily combine SOAP (Web Service) based services/systems with Web 2.0 (REST) services

(6)

Web 2.0 and Web Services II
• Web 2.0 also has many well known capabilities, with Google Maps and Amazon compute/storage services of clear general relevance
• There are also Web 2.0 services supporting novel collaboration modes and user interaction with the web, as seen in social networking sites and portals such as MySpace, YouTube, Connotea, Slideshare ….
• I once thought Web Services were inevitable, but this is no longer clear to me
• Web Services are complicated, slow and non-functional:
  - WS-Security is unnecessarily slow and pedantic (canonicalization of XML)
  - WS-RM (Reliable Messaging) seems to have poor adoption and doesn't work well in collaboration
  - WSDM (distributed management) specifies a lot

(7)

Attack of the Killer Multicores
• Today commodity Intel systems are sold with 8 cores spread over two processors
• Specialized chips such as GPUs and the IBM Cell processor have substantially more cores
• Moore's Law now implies, and will be satisfied by, an exponentially increasing number of cores, doubling every 1.5-3 years
  - Modest increases in clock speed
  - Intel has already prototyped an 80-core server chip, ready in 2011?
• Huge activity in parallel computing programming (recycled from the past?)
  - Some programming models and application styles are similar to Grids
• We will have a Grid on a chip ……….

(8)

Grids meet Multicore Systems
• The expected rapid growth in the number of cores per chip has important implications for Grids
• With 16-128 cores on a single commodity system 5 years from now, one will both be able to build a Grid-like application on a chip and indeed must build such an application to get the Moore's Law performance increase
  - Otherwise you will "waste" cores …..
• One will not want to reprogram as you move your application from a 64 node cluster or a transcontinental implementation to a single chip Grid
• However, multicore chips have a very different architecture from Grids:
  - Shared, not distributed, memory
  - Latencies measured in microseconds, not milliseconds
• Thus Grid and multicore technologies will need to "converge", and the converged technology model will have different requirements from current Grid assumptions

(9)

Grid versus Multicore Applications
• It seems likely that future multicore applications will involve a loosely coupled mix of multiple modules that fall into three classes:
  - Data access/query/store
  - Analysis and/or simulation
  - User visualization and interaction
• This is precisely the mix that Grids support, but Grids of course involve distributed modules
• Grids and Web 2.0 use service oriented architectures to describe systems at the module level; is this an appropriate model for multicore programming?
• Where do multicore systems get their data from?

(10)

(Slide credit: Pradeep K. Dubey, pradeep.dubey@intel.com)

RMS: Recognition Mining Synthesis
• Intel has probably the most sophisticated analysis of future "killer" multicore applications:
[Figure: Intel's RMS taxonomy. Recognition ("What is …?") is model-based multimodal recognition; Mining ("Is it …?") finds a model instance; Synthesis ("What if …?") creates a model instance. Today: model-less, real-time streaming and transactions on static, structured datasets, with very limited realism. Tomorrow: real-time analytics on dynamic, unstructured, multimodal datasets, with photo-realism and physics-based animation.]

(11)

(Slide credit: Pradeep K. Dubey, pradeep.dubey@intel.com)

Recognition ("What is a tumor?"), Mining ("Is there a tumor here?"), Synthesis ("What if the tumor progresses?")
• It is all about dealing efficiently with complex multimodal datasets

(12)
(13)

Role of Data in Grid/Multicore I
• One is typically told to place compute (analysis) at the data, but most of the computing power is in multicore clients on the edge
• These multicore clients can get data from the internet, i.e. from distributed sources
  - This could be personal interests of the client, used by the client to help the user interact with the world
  - It could be cached or copied
  - It could be a standalone calculation or part of a distributed coordinated computation (SETI@Home)
• Or they could get data from a set of local sensors (video-cams and environmental sensors), naturally stored on the client or locally to the client

(14)

Role of Data in Grid/Multicore II
• Note that as you increase the sophistication of the data analysis, you increase the ratio of compute to I/O (see the worked ratio below)
  - A typical modern datamining approach like the Support Vector Machine is sophisticated (dense) matrix algebra, not just text matching
  - http://grids.ucs.indiana.edu/ptliupages/presentations/PC2007/PC07BYOPA.ppt
• The time complexity of sophisticated data analysis will make it more attractive to fetch data from the Internet and cache/store it on the client
  - It will also help with memory bandwidth problems in multicore chips
• In this vision, the Grid "just" acts as a source of data and the Grid application runs locally
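
The compute-to-I/O claim can be made concrete with a back-of-envelope sketch (not from the talk): dense matrix algebra, as in SVM kernel computations, performs O(n³) arithmetic on O(n²) data, so the flops-per-word ratio grows linearly with problem size.

```python
# Why sophisticated analysis favours fetching data once and computing
# locally: an n x n dense matrix multiply does about 2*n^3 operations
# on 2*n^2 input words, so the compute-to-I/O ratio grows like n.
for n in [100, 1_000, 10_000]:
    io_words = 2 * n * n      # read two n x n input matrices
    flops = 2 * n ** 3        # multiply-add count for the product
    print(f"n={n:>6}: flops per word moved = {flops / io_words:.0f}")
```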

(15)

Multicore Programming Paradigms
At a very high level, there are three or four broad classes of parallelism:
• Coarse grain functional parallelism, typified by workflow and often used to build composite "metaproblems" whose parts are also parallel
  - "Compute-File", Database/Sensor, Community, Service and Pleasingly Parallel (master-worker) are sub-classes; a minimal master-worker sketch follows this slide
• Large scale loosely synchronous data parallelism, where dynamic irregular work has clear synchronization points, as in most large scale scientific and engineering problems
• Fine grain (asynchronous) thread parallelism, as used in search algorithms, which are often data parallel (over choices) but don't have universal synchronization points

(16)

Data Parallel Time Dependence
• A simple form of data parallel application is synchronous, with all elements of the application space being evolved with essentially the same instructions
• Such applications are suitable for SIMD computers and run well on vector supercomputers (and GPUs, but these are more general than just synchronous)
• However, synchronous applications also run fine on MIMD machines
  - The SIMD CM-2 evolved to the MIMD CM-5 with the same data parallel language, CMFortran
• The iterative solutions to Laplace's equation are synchronous (a one-sweep sketch follows this slide), as are many full matrix algorithms
• Synchronization on MIMD machines is accomplished by messaging
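
As a concrete instance of the synchronous class, one Jacobi sweep for Laplace's equation, where every interior point is updated by the same instruction (NumPy used for illustration):

```python
# One Jacobi relaxation sweep for Laplace's equation on a 2-D grid:
# every interior point is updated by the same instruction, which is
# the synchronous data parallel pattern described above (NumPy sketch).
import numpy as np

def jacobi_sweep(u):
    """Replace each interior point by the average of its 4 neighbours."""
    v = u.copy()
    v[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1] +
                            u[1:-1, :-2] + u[1:-1, 2:])
    return v

u = np.zeros((64, 64))
u[0, :] = 1.0                 # fixed boundary condition on one edge
for _ in range(100):          # synchronous iterations
    u = jacobi_sweep(u)
print(u[1:4, 1:4])
```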

(17)

Local Messaging for Synchronization
• MPI_SENDRECV is the typical primitive
• Processors do a send followed by a receive, or a receive followed by a send
• In two stages (needed to avoid race conditions), one has a complete left shift
• Often followed by the equivalent right shift, to get a complete exchange
• This logic guarantees that correctly updated data is sent to processors that have their data at the same simulation time
[Figure: left-shift messaging pattern across 8 processors]
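
A minimal sketch of the shift pattern in mpi4py (the Python MPI binding; an illustrative choice, since the talk does not assume any particular binding). MPI_SENDRECV pairs the send and receive, so the race-free two-stage logic is handled inside the library:

```python
# Left-shift with MPI_SENDRECV via mpi4py (illustrative sketch).
# Run with e.g.: mpiexec -n 8 python shift.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

left = (rank - 1) % size    # neighbour that receives my data in a left shift
right = (rank + 1) % size   # neighbour whose data I receive

send = np.array([float(rank)])   # boundary data owned by this processor
recv = np.empty(1, dtype=float)

# Sendrecv combines the send and receive, avoiding the deadlock/race
# issues of naive blocking send-then-receive orderings.
comm.Sendrecv(send, dest=left, recvbuf=recv, source=right)

print(f"rank {rank}: received {recv[0]} from rank {right}")
```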

(18)

Loosely Synchronous Applications
• This is the most common class in large scale science and engineering: one has the traditional data parallelism, but now each data point has, in general, a different update
• It comes from heterogeneity in problems that would be synchronous if homogeneous
• Time steps are typically uniform, but sometimes one needs to support variable time steps across the application space; however, ensure the finer time steps satisfy δt = (t1 − t0)/n for an integer n, so that subspaces with finer time steps still synchronize with the full domain
• Time synchronization via messaging is still valid
• However, one no longer load balances (ensures each processor does equal work in each time step) by putting an equal number of points in each processor
• Load balancing, although NP-complete, is in practice surprisingly easy
[Figure: application time versus application space, with global synchronization at times t0, t1, t2, t3, t4]

(19)

MPI Futures?
• MPI is likely to become more important as multicore systems become more common
• One should use MPI when MPI is needed, and use other messaging for other cases (such as linking services) where different features/performance are appropriate
• MPI has too many primitives, which will handicap broad implementation/adoption
• Perhaps only have one collective primitive like …

(20)

Fine Grain Dynamic Applications
• Here there is no natural universal "time" as there is in science algorithms, where an iteration number or Mother Nature's time gives global synchronization
• Loose (zero) coupling or special features of the application are needed for successful parallelization
• In computer chess, the minimax scores at parent nodes provide multiple dynamic synchronization points
[Figure: application time versus application space for dynamically synchronized threads]

(21)

Computer Chess
• Thread level parallelism, unlike the position evaluation parallelism used in other systems
• Competed, with poor reliability and results, in the 1987 and 1988 ACM Computer Chess Championships

(22)

Discrete Event Simulations
• These are familiar in military and circuit (system) simulations when one uses macroscopic approximations
• Also probably the paradigm of most multiplayer Internet games/worlds
• Note Nature is perhaps synchronous when viewed quantum mechanically in terms of uniform fundamental elements (quarks and gluons etc.)
  - It is loosely synchronous when considered in terms of particles and mesh points
  - It is asynchronous when viewed in terms of tanks, people, arrows etc.
• Circuit simulations can be done loosely synchronously, but this is inefficient as many elements are inactive

(23)

Programming Models
• The three major models are supported by the HPCS languages, which are very interesting but too monolithic
• So the fine grain thread parallelism and large scale loosely synchronous data parallelism styles are distinctive to parallel computing, while the coarse grain functional parallelism of multicore overlaps with workflows from Grids and mashups from Web 2.0
• It seems plausible that a more uniform approach will evolve for the coarse grain case, although this is the least constrained of the programming styles, as latency issues are typically not critical
  - Multicore would have the strongest performance constraints
  - Web 2.0 and multicore have the most important usability constraints
• A possible model for broad use of multicores is that the difficult parallel algorithms are coded as libraries (fine grain thread parallelism and large scale loosely synchronous data parallelism)

(24)

Google MapReduce: Simplified Data Processing on Large Clusters
http://labs.google.com/papers/mapreduce.html
• This is a dataflow model between services, where services can do useful document oriented data parallel applications, including reductions
• The decomposition of services onto cluster engines is automated
• The large I/O requirements of the datasets change the efficiency analysis in favor of dataflow
• Services (count words, in the canonical example) can obviously be extended to general parallel applications
• There are many alternative ways to express either dataflow and/or …
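
As a concrete illustration of the model only (not Google's implementation), a word count in plain Python showing the map, shuffle and reduce phases:

```python
# Minimal sketch of the MapReduce model applied to word counting.
# Illustrative only: Google's system adds automated decomposition,
# scheduling and fault tolerance across a cluster.
from collections import defaultdict
from itertools import chain

def map_phase(document):
    """Emit (word, 1) pairs; runs independently on each document."""
    return [(word, 1) for word in document.split()]

def shuffle(pairs):
    """Group intermediate values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Combine all values for one key; here a simple sum."""
    return key, sum(values)

documents = ["the grid meets the chip", "the web meets the grid"]
intermediate = chain.from_iterable(map_phase(d) for d in documents)
counts = dict(reduce_phase(k, v) for k, v in shuffle(intermediate).items())
print(counts)   # {'the': 4, 'grid': 2, 'meets': 2, 'chip': 1, 'web': 1}
```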

(25)

Programming Models
• The services and objects in distributed computing are usually "natural" (they come from the application), whereas the parts connected by MPI (or created by a parallelizing compiler) come from "artificial" decompositions and are not naturally considered services
• Services in multicore (parallel computing) are the original modules before decomposition, and it is these modules that coarse grain functional parallelism addresses
• Most of the "difficult" issues in parallel computing …

(26)

Parallel Software Paradigms: Top Level
• In the conventional two-level Grid/Web Service programming model, one programs each individual service and then separately programs their interaction
  - This is the Grid-aware Services programming model
  - SAGA supports Grid-aware programs?
• This is generalized to multicore, with "Marine Corps" programming of services for the "difficult" cases:
  - Loosely synchronous
  - Fine grain threading
  - Discrete event simulation
• The "average" programmer produces mashups or workflows

(27)

The Marine Corps Lack of Programming Paradigm Library Model
• One could assume that parallel computing is "just too hard for real people" and assume that we use a Marine Corps of programmers to build, as libraries, excellent parallel implementations of "all" core capabilities
  - e.g. the primitives identified in the Intel application analysis
  - e.g. the primitives supported in Google MapReduce, HPF, PeakStream, Microsoft Data Parallel .NET etc.
• These primitives are orchestrated (linked together) by overall frameworks such as workflows or mashups
• The Marine Corps probably is content with efficient rather than easy-to-use approaches

(28)

Component Parallel and Program Parallel
• The component parallel paradigm is where one explicitly programs the different parts of a parallel application, with the linkage either specified externally, as in workflow, or in the components themselves, as in most other component parallel approaches
  - In Grids, components are natural
  - In parallel computing, components are produced by decomposition
• In the program parallel paradigm, one writes a single program to describe the whole application, and some combination of compiler and runtime breaks up the program into the multiple parts that execute in parallel
• Note that a program parallel approach will often call a built-in runtime library written in component parallel fashion
  - A parallelizing compiler could call an MPI library routine

(29)

Component Parallel and Program Parallel
• Program parallel approaches include:
  - Data structure parallel, as in Google MapReduce, HPF (High Performance Fortran), HPCS (High-Productivity Computing Systems) or "SIMD" co-processor languages (PeakStream, ClearSpeed and Microsoft Data Parallel .NET)
  - Parallelizing compilers, including OpenMP annotation
  - Note OpenMP and HPF have failed in some sense for large scale parallel computing (writing an algorithm in standard sequential languages throws away information needed for parallelization)
• Component parallel approaches include:
  - MPI (and related systems like PVM) parallel message passing
  - PGAS (Partitioned Global Address Space: CAF, UPC, Titanium, HPJava)
  - C++ futures and active objects (sketched after this list)
  - CSP … Microsoft CCR and DSS
  - Workflow and mashups
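
Futures are the easiest of these to illustrate compactly; a minimal Python sketch standing in for the C++ futures/active-object style (not any specific library named in the talk):

```python
# Minimal sketch of futures as a component parallel primitive
# (a Python stand-in for the C++ futures / active objects above).
from concurrent.futures import ThreadPoolExecutor

def service_call(x):
    return x * x          # stands in for a remote or parallel component

with ThreadPoolExecutor() as pool:
    future = pool.submit(service_call, 12)   # returns immediately
    # ... the caller can overlap other work here ...
    print(future.result())                   # rendezvous: blocks until ready
```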

(30)

Why people like MPI!
• Jason J. Beech-Brandt and Andrew A. Johnson, at AHPCRC Minneapolis
• BenchC is an unstructured finite element CFD solver
• Looked at OpenMP on a shared memory Altix, with some effort to optimize
• Optimized UPC on several machines
• MPI always good, but the other approaches erratic
• Other studies reach similar conclusions?
[Chart: BenchC performance across several clusters, after optimization of UPC]

(31)

Web 2.0 Systems are Portals, Services, Resources
• Captures the incredible development of interactive Web sites enabling people to create and collaborate

(32)

Mashups v Workflow?
• Mashup tools are reviewed at http://blogs.zdnet.com/Hinchcliffe/?p=63
• Workflow tools are reviewed by Gannon and Fox: http://grids.ucs.indiana.edu/ptliupages/publications/Workflow-overview.pdf
• Both include scripting in PHP, Python, sh etc., as both implement distributed programming at the level of services (see the sketch after this slide)
• Mashups use all types of service interfaces and do not have the potential robustness (security) of the Grid service approach
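
A minimal sketch of a mashup as service-level distributed programming: one service's JSON output feeds the next. Both URLs and all response fields are hypothetical, invented for illustration:

```python
# Two-step mashup sketch: compose two hypothetical JSON services,
# i.e. distributed programming at the level of services.
import json
from urllib.request import urlopen

def call(url):
    with urlopen(url) as r:            # both endpoints are hypothetical
        return json.load(r)

places = call("http://geo.example.org/lookup?q=Edinburgh")
for p in places["results"]:            # fields assumed for illustration
    weather = call(f"http://wx.example.org/now?lat={p['lat']}&lon={p['lon']}")
    print(p["name"], weather["summary"])
```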

(33)

Web 2.0 APIs
• http://www.programmableweb.com/apis has (May 14, 2007) 431 Web 2.0 APIs, with Google Maps the most often used in mashups
• This site acts as a "UDDI" for Web 2.0

(34)

The List of Web 2.0 APIs
• Each site has an API and its features
• Divided into broad categories
• Only a few are used a lot (42 APIs are used in more than 10 mashups)
• RSS feed of new APIs
• Amazon S3 growing

(35)

APIs/Mashups per Protocol Distribution
[Chart: distribution over REST, SOAP, XML-RPC, REST+XML-RPC, XML-RPC+REST+SOAP, REST+SOAP, JS and other]

(36)

• 4 more mashups each day, for a total of 1906 on April 17, 2007 (4.0 a day over the last month)
• Note ClearForest runs Semantic Web Services mashup competitions (not workflow competitions)
• Some mashup types: aggregators, search aggregators, visualizers, mobile, maps, games

(37)

Implication for Grid Technology of Multicore and Web 2.0 I
• Web 2.0 and Grids are addressing a similar application class, although Web 2.0 has focused on user interactions
  - So the technology has similar requirements
• Multicore differs significantly from Grids in component location, and this seems particularly significant for data
  - It is therefore not clear how similar the applications will be
  - The Intel RMS multicore application class is pretty similar to Grids
• Multicore has more stringent software requirements than Grids, as the latter has intrinsic network overhead

(38)

Implication for Grid Technology of Multicore and Web 2.0 II
• Multicore chips require low overhead protocols to exploit low latency, and that suggests simplicity
  - We need to simplify MPI AND Grids!
• Web 2.0 chooses simplicity (REST rather than SOAP) to lower the barrier to everyone participating
• Web 2.0 and multicore tend to use traditional (possibly visual) scripting languages for the equivalent of workflow, whereas Grids use a visual interface backend recorded in BPEL
  - Google MapReduce illustrates a popular Web 2.0 and multicore approach to dataflow

(39)

Implication for Grid Technology of Multicore and Web 2.0 III
• Web 2.0 and Grids both use SOA, Service Oriented Architectures
  - It seems likely that multicore will also adopt it, although a more conventional object oriented approach is also possible
  - Services should help multicore applications integrate modules from different sources
  - Multicore will use fine grain objects but coarse grain services
• "System of Systems": Grids, Web 2.0 and multicore are likely to build systems hierarchically out of smaller systems
  - We need to support Grids of Grids, Webs of Grids, Grids of Multicores etc., i.e. systems of systems of all sorts

(40)

The Ten Areas Covered by the 60 Core WS-* Specifications

10: Portals and User Interfaces | WSRP (Remote Portlets)
9: Policy and Agreements | WS-Policy, WS-Agreement
8: Management | WSDM, WS-Management, WS-Transfer
7: System Metadata and State | WSRF, WS-MetadataExchange, WS-Context
6: Service Discovery | UDDI, WS-Discovery
5: Security | WS-Security, WS-Trust, WS-Federation, SAML, WS-SecureConversation
4: Workflow and Transactions | BPEL, WS-Choreography, WS-Coordination
3: Notification | WS-Notification, WS-Eventing (Publish-Subscribe)
2: Service Internet | WS-Addressing, WS-MessageDelivery; Reliable Messaging WSRM; Efficient Messaging MOTM
1: Core Service Model | XML, WSDL, SOAP

(41)

WS-* Areas and Web 2.0

10: Portals and User Interfaces | Start Pages, AJAX and Widgets (Netvibes), Gadgets
9: Policy and Agreements | Service dependent; processed by application
8: Management == Interaction | WS-Transfer style protocols (GET, PUT etc.)
7: System Metadata and State | Processed by application; no system state; microformats are a universal metadata approach
6: Service Discovery | http://www.programmableweb.com
5: Security | SSL, HTTP authentication/authorization; OpenID is Web 2.0 single sign-on
4: Workflow and Transactions (no transactions in Web 2.0) | Mashups, Google MapReduce; scripting with PHP, JavaScript ….
3: Notification | Hard with HTTP without polling; JMS perhaps?
2: Service Internet | No special QoS; use JMS or equivalent?
1: Core Service Model | XML becomes optional but still useful; SOAP becomes JSON, RSS, ATOM; WSDL becomes REST with API as GET, PUT etc.; Axis becomes XmlHttpRequest
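
To make the Core Service Model row concrete: a minimal sketch of the Web 2.0 style in Python, a plain HTTP GET returning JSON in place of a WSDL-described SOAP call. The endpoint URL and response fields are hypothetical, invented for illustration:

```python
# Sketch of the Web 2.0 style: plain HTTP GET returning JSON,
# in place of a SOAP envelope described by WSDL.  The endpoint
# and the response fields are hypothetical.
import json
from urllib.request import urlopen
from urllib.parse import urlencode

params = urlencode({"q": "multicore", "format": "json"})
with urlopen(f"http://api.example.org/search?{params}") as response:
    result = json.load(response)      # JSON replaces the SOAP/XML payload

for item in result.get("items", []):  # fields assumed for illustration
    print(item.get("title"))
```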

(42)

WS-* Areas and Multicore

10: Portals and User Interfaces | Web 2.0 technology popular
9: Policy and Agreements | Handled by application
8: Management == Interaction | Interaction between objects is a key issue in parallel programming, trading off efficiency versus performance
7: System Metadata and State | Environment variables
6: Service Discovery | Use libraries
5: Security | Not so important intra-chip
4: Workflow and Transactions | Many approaches; scripting languages popular
3: Notification | Publish-subscribe for events and interrupts
2: Service Internet | Not so important intra-chip
1: Core Service Model | Fine grain Java, C#, C++ objects and coarse grain services as in DSS; information passed explicitly or by handles; MPI needs to be updated to handle non-scientific applications, as in CCR

(43)

CCR as an example of a Cross Paradigm Run Time
• Naturally supports fine grain thread switching with message passing, with around 4 microseconds latency for 4 threads switching to 4 others on an AMD PC with C#
  - Threads are spawned; no rendezvous
• Has around 50 microseconds latency for coarse grain service interactions with the DSS extension, which supports Web 2.0 style messaging
• MPI collectives (shift and exchange) vary from 10 to 20 microseconds latency in rendezvous mode
• Not as good as the best MPIs, but managed code, and it supports Grids, Web 2.0 and parallel computing ……

(44)


Microsoft CCR
• Supports exchange of messages between threads using named ports
• FromHandler: spawn threads without reading ports
• Receive: each handler reads one item from a single port
• MultipleItemReceive: each handler reads a prescribed number of items of a given type from a given port; items in a port can be general structures, but all must have the same type
• MultiplePortReceive: each handler reads one item of a given type from multiple ports
• JoinedReceive: each handler reads one item from each of two ports; the items can be of different types
• Choice: execute a choice of two or more port-handler pairings
• Interleave: consists of a set of arbiters (port-handler pairs) of 3 types: Concurrent, Exclusive or Teardown (called at the end for clean up); Concurrent arbiters are run concurrently, but Exclusive handlers are serialized
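
CCR itself is a C# runtime; purely as an analogy (this is not the CCR API), the basic idea of a named port with a Receive handler can be sketched with Python threads and queues:

```python
# Rough analogy to CCR's named ports and Receive arbiters, using
# Python threads and queues.  This mimics only the basic idea;
# it is not the CCR API, and real CCR handlers are scheduled far
# more cheaply than OS threads.
import threading
import queue

port = queue.Queue()          # a "port": a typed message channel

def receive_handler():
    """Like CCR Receive: read one item at a time from a single port."""
    while True:
        item = port.get()
        if item is None:      # sentinel for teardown
            break
        print(f"handled {item}")

worker = threading.Thread(target=receive_handler)
worker.start()                # spawn the handler; no rendezvous with sender

for message in ["compute", "exchange", "shift"]:
    port.put(message)         # senders post to the port asynchronously

port.put(None)
worker.join()
```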

(45)

Overhead (latency) of an AMD 4-core PC with 4 execution threads on MPI style rendezvous messaging for shift and exchange, implemented either as two shifts or as a custom CCR pattern. Compute time is 10 seconds divided by the number of stages.
[Plot: time in microseconds versus number of stages (millions), for rendezvous exchange as two shifts and rendezvous exchange customized for MPI]

(46)

Overhead (latency) of an INTEL 8-core PC with 8 execution threads on MPI style rendezvous messaging for shift and exchange, implemented either as two shifts or as a custom CCR pattern. Compute time is 15 seconds divided by the number of stages.
[Plot: time in microseconds versus number of stages (millions), for rendezvous exchange as two shifts and rendezvous exchange customized for MPI]

(47)

Timing of HP Opteron Multicore as a function of the number of simultaneous two-way service messages processed (November 2006 DSS release)
• CGL measurements of Axis 2 show about 500 microseconds; DSS is 10 times better
