(1)

A Cloud Computing Platform for FY-4 Based on Resource Scheduling Technology

Xiangang Zhao, Manyun Lin, Lan Wei, Lizi Xie, Zhanyun Zhang, Peng Guo

National Satellite Meteorological Center, CMA

5th Asia‐Oceania Meteorological Satellite Users Conference 

(2)

Outline

1. IT scale of FY-4 ground segment

2. Major challenges and solutions

3. FY-4 IT architecture design

(3)

1. IT scale of FY-4 ground segment

• Second-generation geostationary satellite, 4 instruments, larger amounts of data, more products.

• Computing capability requirement: 340 TFlops.

• Storage capacity requirement: 11 PB.

[Figure: two pie charts showing storage capacity requirements (TB) and computing capability requirements (TFlops) by subsystem.]

Storage capacity requirements by subsystem (TB): CNS 5220; DSS 4000; CVS 1225; ADS 450; MCS 280; NRS 200; SWS 50.

Computing capability requirements by subsystem (values as printed on the chart; they total about 338,877, which matches the 340 TFlops requirement if read as GFlops): CNS 182917.6; DTS 42998; CVS 38214; MCS 19884.8; DSS 18169; NRS 15727.6; ADS 12557; PGS 5587; SWS 2822.2.

(4)

Outline

1. IT scale of FY-4 ground segment

2. Major challenges and solutions

3. FY-4 IT architecture design

(5)

Major Challenges

How to achieve high reliability and high performance?

How to share resources and save costs?

What about expansions and upgrades?

How to break information islands and build a sustainable system for FY-4A, FY-4B, FY-3, and …

(6)

Solutions: Adopt the IT architecture of FY-2

• Adopt the IT architecture of FY-2

• Choose Unix servers and high-end storage system

• Set up an exclusive system for each satellite

• Good reliability and performance

(7)

Solutions: Adopt a new kind of IT technology

• Cloud computing, as a new kind of IT technology, is widely applied.

• It offers high scalability, rapid deployment, cost savings, and other benefits.

(8)

Application of cloud computing

• Cloud computing is also applied in the ground segment of satellites, such as GPS, communication and meteorological satellites.

• Nebula is one of NASA's cloud computing platforms, used for data sharing and for supporting applications such as climate prediction.

• According to Gartner, Inc., nearly half of large enterprises will have cloud deployments by the end of 2017.

(9)

Outline

1. IT scale of FY-4 ground segment

2. Major challenges and solutions

3. FY-4 IT architecture design

4. Summary and plan

(10)

3. IT architecture design of FY-4

Schedule system design

• Separating operation scheduling from resource scheduling brings considerable flexibility.

• Operation scheduling need not care about the underlying platform architecture.

• Resource scheduling need not care about operation logic; it concentrates only on resource management and the scheduling of individual jobs.
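
The separation described on this slide can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class names `OperationScheduler` and `ResourceScheduler`, the `Job` fields, and the "most free cores" placement rule are hypothetical and are not taken from the FY-4 design. The point is only that the operation layer knows the order of steps while the resource layer places one job at a time.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """A single unit of work; the resource layer sees nothing beyond this."""
    name: str
    cores: int


@dataclass
class ResourceScheduler:
    """Hypothetical resource layer: places one job at a time on a node."""
    free_cores: dict[str, int]  # node name -> free cores
    placements: dict[str, str] = field(default_factory=dict)  # job -> node

    def submit(self, job: Job) -> str:
        # Pick the node with the most free cores that can fit the job.
        candidates = [n for n, c in self.free_cores.items() if c >= job.cores]
        if not candidates:
            raise RuntimeError(f"no node can host {job.name}")
        node = max(candidates, key=lambda n: self.free_cores[n])
        self.free_cores[node] -= job.cores
        self.placements[job.name] = node
        return node


class OperationScheduler:
    """Hypothetical operation layer: knows the order of steps, not the nodes."""

    def __init__(self, resources: ResourceScheduler) -> None:
        self.resources = resources

    def run_workflow(self, steps: list[Job]) -> list[tuple[str, str]]:
        # Submit steps in operational order; placement is delegated entirely.
        return [(job.name, self.resources.submit(job)) for job in steps]


pool = ResourceScheduler(free_cores={"node-a": 8, "node-b": 16})
ops = OperationScheduler(pool)
result = ops.run_workflow([Job("ingest", 4), Job("calibrate", 12), Job("product", 8)])
print(result)
```

Because `OperationScheduler` only calls `submit`, the node inventory or the placement rule could change without touching the workflow logic, which is the flexibility the slide refers to.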

(11)

The cloud platform architecture of FY-4

(12)

Architecture description

The infrastructure layer organizes all the medium- and low-level heterogeneous physical resources, such as computing, networking and storage, to supply high-performance computing power, high-…

(13)

Architecture description

The resource scheduling layer achieves unified pool management of heterogeneous computing resources and designs fault-tolerant mechanisms that deal with resource and application exceptions to ensure high efficiency, flexibility and …

(14)

Architecture description

The job scheduling bus layer provides a standard interface for job submission from the application layer and is compatible with LSF, PBS, and other schedulers in the resource scheduling layer. Acting as a meta-scheduler, this layer can forward jobs to their appropriate schedulers, in which fault-tolerant strategies for fault …
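
A minimal sketch of the meta-scheduler idea follows, under stated assumptions: the `Backend` interface, the `RecordingBackend` stand-in, the queue names, and the "first backend that serves the queue" routing rule are all hypothetical. No real LSF or PBS call is made; a real adapter would wrap the corresponding scheduler's own submission mechanism.

```python
from typing import Protocol


class Backend(Protocol):
    """Interface every underlying scheduler adapter is assumed to offer."""
    name: str

    def accepts(self, queue: str) -> bool: ...
    def submit(self, job_id: str, command: str) -> str: ...


class RecordingBackend:
    """Stand-in adapter; a real one would call LSF or PBS instead."""

    def __init__(self, name: str, queues: set[str]) -> None:
        self.name = name
        self.queues = queues
        self.submitted: list[str] = []

    def accepts(self, queue: str) -> bool:
        return queue in self.queues

    def submit(self, job_id: str, command: str) -> str:
        self.submitted.append(job_id)
        return f"{self.name}:{job_id}"


class JobBus:
    """Meta-scheduler sketch: one submit() call, many schedulers behind it."""

    def __init__(self, backends: list[Backend]) -> None:
        self.backends = backends

    def submit(self, job_id: str, command: str, queue: str) -> str:
        # Forward to the first backend that serves the requested queue.
        for backend in self.backends:
            if backend.accepts(queue):
                return backend.submit(job_id, command)
        raise LookupError(f"no scheduler serves queue {queue!r}")


lsf_like = RecordingBackend("lsf", {"realtime"})
pbs_like = RecordingBackend("pbs", {"batch", "reprocess"})
bus = JobBus([lsf_like, pbs_like])
ticket_a = bus.submit("job-1", "run_l1.sh", queue="realtime")
ticket_b = bus.submit("job-2", "run_l2.sh", queue="batch")
print(ticket_a, ticket_b)
```

The application layer only ever sees `JobBus.submit`, so adding another scheduler means adding one adapter rather than changing the callers.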

(15)

Architecture description

The application layer is used to provide the user interface 

(16)

Equipment selection and resource pooling design

[Figure: computing capability distribution. Blade Server 55%, PC Server 28%, Unix Server 17%.]

[Figure: storage capacity distribution. High-end storage 38%; the remainder is low-end storage.]

(17)

Key algorithms (1/2)

1. Resource failure processing algorithm

• When a single computing node fails, the algorithm handles the failure and moves all the jobs on this node to other nodes.

2. Resource group failure processing algorithm

• When a resource group collapses, the algorithm tries to restore it; if the restoration fails, it moves all the jobs on this group, together with the related data, to other groups.
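
The two failure-handling rules above can be sketched as follows. This is an illustration under assumptions, not the platform's actual algorithm: the function names, the dictionary representation of assignments, and the "least-loaded survivor" choice of target are hypothetical, and data migration is left out.

```python
from collections.abc import Callable


def evacuate(assignments: dict[str, list[str]], failed: str) -> dict[str, list[str]]:
    """Move every job from the failed node to the least-loaded survivors."""
    survivors = {n: list(jobs) for n, jobs in assignments.items() if n != failed}
    if not survivors:
        raise RuntimeError("no healthy node left to take the jobs")
    for job in assignments.get(failed, []):
        target = min(survivors, key=lambda n: len(survivors[n]))
        survivors[target].append(job)
    return survivors


def handle_group_failure(
    groups: dict[str, list[str]],
    failed: str,
    try_restore: Callable[[str], bool],
) -> dict[str, list[str]]:
    """Try to restore the group first; migrate its jobs only if that fails."""
    if try_restore(failed):
        return {g: list(jobs) for g, jobs in groups.items()}
    # Restoration failed: same evacuation logic, applied to whole groups.
    # Data migration would also happen here; it is omitted in this sketch.
    return evacuate(groups, failed)


nodes = {"n1": ["j1", "j2"], "n2": ["j3"], "n3": []}
after = evacuate(nodes, "n1")
print(after)
```

The `try_restore` callback is passed in so that the restore-then-migrate order stays visible without committing to any particular recovery mechanism.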

(18)

Key algorithms (2/2)

3. Job failure processing algorithm

• When a job fails, the algorithm reruns it or moves it to another computing node.

4. Scheduler failure processing algorithm

• When a scheduler becomes invalid, the algorithm tries to recover it; if the recovery fails, it moves all the jobs on this scheduler, together with the related data, to other schedulers.

5. Load balance scheduling algorithm

• According to the scheduling strategy, it aims to optimize resource usage, maximize throughput, minimize response time, and avoid overloading any single resource.
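
The slides do not say which load-balancing strategy is used. As one possible illustration, the sketch below applies a common greedy heuristic: sort jobs by estimated cost and always give the next job to the node with the smallest accumulated load. The function name `balance`, the cost estimates, and the node names are assumptions made for the example.

```python
import heapq


def balance(job_costs: dict[str, float], nodes: list[str]) -> dict[str, list[str]]:
    """Assign each job to the currently least-loaded node (greedy heuristic)."""
    if not nodes:
        raise ValueError("at least one node is required")
    heap = [(0.0, node) for node in nodes]  # (accumulated load, node name)
    heapq.heapify(heap)
    plan: dict[str, list[str]] = {node: [] for node in nodes}
    # Placing the most expensive jobs first tends to even out the final loads.
    for job, cost in sorted(job_costs.items(), key=lambda kv: kv[1], reverse=True):
        load, node = heapq.heappop(heap)
        plan[node].append(job)
        heapq.heappush(heap, (load + cost, node))
    return plan


plan = balance({"a": 5, "b": 3, "c": 3, "d": 2, "e": 1}, ["n1", "n2"])
print(plan)
```

In this example both nodes end up with a total cost of 7, so no single node is overloaded; a production scheduler would also have to account for memory, I/O and job priorities, which this sketch ignores.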

(19)

Outline

1. IT scale of FY-4 ground segment

2. Major challenges and solutions

3. FY-4 IT architecture design

4. Summary and plan

(20)

Summary

• Setting up resource pooling using general devices without virtualization technology enhances expansibility and improves the system's performance-to-price ratio. Theoretically, it can save 60% of the cost of computing servers: 60% = 80% × (1 − 1/4).

• Resource scheduling, including load-balancing scheduling and fault-tolerance mechanisms, can ensure the reliability and efficiency of the system.

• The architecture is still in the design stage; more problems need to be solved during the implementation phase in the future.
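
The slide gives the 60% figure as 80% × (1 − 1/4) without defining the two terms. One plausible reading, stated here purely as an assumption, is that about 80% of the computing capacity is moved to general-purpose servers that cost about one quarter of the price of the equipment they replace. Under that reading the arithmetic can be checked as follows (the function and parameter names are hypothetical):

```python
def theoretical_saving(general_share: float, price_ratio: float) -> float:
    """Fraction of server cost saved when `general_share` of the capacity
    moves to hardware costing `price_ratio` times the original price."""
    return general_share * (1.0 - price_ratio)


# Assumed inputs: 80% of capacity on general devices at 1/4 of the price.
saving = theoretical_saving(general_share=0.80, price_ratio=0.25)
print(f"{saving:.0%}")
```

The result depends entirely on the two assumed inputs, so it should be read as a theoretical upper estimate rather than a measured saving.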

(21)

Plan and advice

Share a cloud for all FengYun satellites in the future.

Make full use of social resources to gain standard computing and storage capacity.

Design the interface between private and public cloud and provide data sharing conveniently for the public.

Advice:

Carry out more exchanges about IT architecture in the future.

FengYun Cloud

(22)
