
June 2011

Date: 27/06/2011

Report of the e-Infrastructure Advisory Group


Executive summary

This report sets out the findings and recommendations of the e-Infrastructure Advisory Group

commissioned and chaired by BIS into the activities and recommendations put forward by the 2009 RCUK

International Review of e-Science and the 2010 report “Delivering the UK’s e-Infrastructure for Research

and Innovation”.

The advisory group was composed of representatives from UK Research Funders (RCUK and Wellcome

Trust), Higher Education Funding Councils (DELNI, HEFCE and SFC) and Universities UK. Its membership

and Terms of Reference are at Annex A to this report.

As well as reviewing these reports, a consultation exercise was undertaken to provide a baseline of the

current level of infrastructure provision at the local and national level, an understanding of the processes

by which future provision is currently determined and to highlight infrastructure elements where support

and development is required to enable sustainability. This consultation canvassed UK research councils,

UK HEIs and selected end-user organisations to provide a range of perspectives to take into account

geographical issues (e.g. consortia, regional funding etc), institutional research focuses and degree of

utilisation and experience of e-infrastructure.

Key points

The Advisory Group recognised the following emergent points from the review and consultation exercise:

• A priority need for the continuation and development of a dedicated research network infrastructure

such as that currently provided by JANET for researchers requiring the highest data bandwidth and

lowest latencies;

• The decline in research grade e-literacy among UK researchers was seen as a concern and this was

magnified by the decreasing flow of computational specialists from undergraduate to postgraduate

level;

• Addressing big, small, and persistent data issues requires much more coordination and clearer

leadership to understand the variation in requirements and importance across the research base in

order to design a clear and tractable response;

• The high level of investment and commitment to the development of e-infrastructure at the HEI level

is encouraging and has led to significant growth in UK capability. To ensure continued growth there is

now a need for coordination to ensure that this is sustainable and joined-up to national and regional

strategic drivers as well as the ambition of individual HEIs;

• There is a clear and ongoing need for provision of a HPC National service to provide the capability to

tackle leading edge computational science and simulation with mid-range resource being provided

on a regional/research cluster basis;

• Scientific software development is a clear area for action to derive the maximum benefit from

current and next generation computer architectures and provision paradigms;

• Future leadership and development of strategy for e-infrastructures needs to be taken forward with

a matrix of stakeholders rather than a single organisation. Development will also need to reflect the

varying rates of development and maturity for each element in the e-infrastructure;

• Funding from Central Government should be reserved for infrastructure that would provide a

recognised National capability and/or provide benefit to more than one research institution or

community;


Recommendations

To address the key points identified the Advisory Group makes the recommendations that:

• Future development of activities be taken forward via a stranded approach recognising the fact that

each element of the overall e-infrastructure has its own timescales with respect to methodological

and technological development;

• The following strands form the focus for developments: Research Networks, People & Skills, Data,

Compute Infrastructure, Software Development and Security & Authentication;

• The continuing provision and development of a UK Research Intensive Network Infrastructure be

treated as the highest priority. It is further recommended that this strand be taken forward with the

full engagement of UK Research Funders and research leaders;

• Responsibility for leading the People and Skills strand lie with the Higher Education Funding Councils

and that this be taken forward in partnership with the RCUK and learned Societies to ensure

alignment with research priorities and skills shortage areas;

• Development of the Data strand to be taken forward by a consortium of Research Funders (RCUK and

Charities) to cover the breadth of need across the research base and to capture the texture of

differing applications within the strand;

• Research Councils retain their coordinating role for investment in National High Performance

Computing facilities and that the scope for partnership in the provision of compute resource be

widened to include engagement with research intensive universities to identify synergies or

alternative models for HPC provision. To include analysis of application areas that may be efficiently

migrated to appropriate “Cloud” resources;

• Current investments in software development be reviewed to identify successful models of support

with a view to treatment of scientific software as a Research Infrastructure. This to be taken forward

by Researchers and Research Funders working in partnership;

• Development of authentication and security be continued and improved by JISC to ensure that

transparent use and benefits of a Nationally shared e-infrastructure can become a reality;

Next Steps

The Advisory Group has suggested an outline governance structure for progression in section 1.4 of

this paper and recommends that BIS take responsibility for its implementation and development of the

overall strategy for an integrated e-infrastructure. Within this it is recommended that:

• RCUK lead the development of the Compute Infrastructure, Data and Software Development strands;

• JISC lead on the development of the strands for Research Networks and Authentication & Security;

• UK Higher Education Funding Councils lead on the development of the People and Skills strand.

These assignments are recommended on the basis of the current and future expected remits and

responsibilities of these organisations.


1.0 Summary and recommendations

1.1 Introduction

This report sets out the findings and recommendations of the e-Infrastructure Advisory Group

commissioned and chaired by BIS into the activities and recommendations put forward by the 2009 RCUK

International Review of e-Science and the 2010 report “Delivering the UK’s e-Infrastructure for Research

and Innovation”. The Advisory Group’s terms of reference and its membership are at Annex A to this

paper.

The advisory group was composed of representatives from UK Research Funders (RCUK and Wellcome

Trust), Higher Education Funding Councils (DELNI, HEFCE and SFC) and Universities UK (Annex A).

In reviewing the recommendations of both the 2009 International Review and 2010’s e-Infrastructure

report the Group determined that its principal focus should be to provide a framework within which to

advance the 2010 e-Infrastructure report’s recommendation that:

“The UK’s Research and Innovation e-infrastructure needs to be led and driven to deliver a UK wide vision

for research e-infrastructure, embedded in the international context essential to today’s research

challenges. The leadership must provide a multi-year perspective, identify best practice, coordinate

stakeholder investment and champion relevant and fit for purpose cross-disciplinary standards to

facilitate coordination”

To aid in the development of the framework, a consultation exercise was undertaken to provide a

baseline of the current level of infrastructure provision at the local and national level, an understanding

of the processes by which future provision is currently determined and to highlight infrastructure

elements where support and development is required to enable sustainability.

As well as canvassing UK research councils, the consultation targeted UK HEIs and selected end-user

organisations (Annex B). These organisations were chosen to provide a range of perspectives to take into

account geographical issues (e.g. consortia, regional funding etc), institutional research focuses and

degree of utilisation and experience of e-infrastructure; the invitations to participate in the consultation

are reproduced at Annex C. A summary of the identified trends across all organisations is given at

Section Y.

1.2 Principal findings

The consultation exercise and review by the Advisory Group highlighted the following key points:

• The need for a dedicated computer network for researchers requiring the highest bandwidth and

lowest latencies is a critical element of the UK e-Infrastructure. This was evidenced by the almost

unanimous support for Super JANET 6 or a network with equivalent functionality in the responses

received from HEIs;

• The Group noted that many of the responses highlighted a strong connection between software,

people & skills. Although the Group recognised and appreciated the linkages and need for

development in these areas, it was felt that they need to be separated to ensure appropriate

treatment. The Group also noted the challenges and potential sensitivities surrounding the

development of a skills strand given current policy relating to Higher Education and student choice.


HEFCE would liaise with the other HE funding bodies and, in England, consider the supply of

graduates and skills through the HE system as part of the review of Teaching Funding.

• Dealing with data is an area where further coordination and scoping needs to occur in order to both

understand the variation of requirement and the relative importance of data across the research

base. This will be necessary to generate an appropriate and tractable response at both a local and

national level.

• In recent years, there had been a high level of bottom up activity within HEIs and a lot of hardware

(especially in the area of HPC) had been funded through this mode. Although the Group saw these

significant local investments as encouraging, there appeared to be an emergent need for further

coordination to ensure that appropriate growth could be sustained and that future investments are

“joined-up” to regional and national strategic drivers. Leading on from this there is a clear and

developing need for sharing of hardware resources, especially as the scale of requirement stretches

beyond the ability of individual organisations to finance, procure and provide the necessary support

infrastructures. As well as maintaining the competitiveness of the UK Research Base, equipment

sharing would deliver concomitant benefits in increased accessibility, robustness with respect to

network security, integrity, system resilience and efficiency savings in excess of current

arrangements.

• The Group also identified from the responses that there is a clear and ongoing need for provision

of HPC systems at the National level with an emerging trend toward provision of mid-range systems

on a regional/research cluster basis. However the need for dedicated systems should be reviewed

against emerging cloud provision targeted at traditional HPC application areas;

• Software development at all levels of the software stack is a clear area for further action in order to

fully utilise current investments and to put the UK in a position to exploit next generation computing

architectures (e.g. many-core and GPU architectures) and paradigms (e.g. cloud computing and data

intensive computing);

• The Group recommends that future strategy development in the area of e-infrastructure needs to be

taken forward by a matrix of stakeholders rather than any single council, institution or sector taking

on both leadership and development.

• With regard to Research Funder or direct Government funding of infrastructure, the Group felt that

this should be reserved for projects that would deliver a National capability (i.e. at a level that could

not be provided without strategic investment) or would provide benefit to more than one research

institution.

1.3 Recommendations

The e-Infrastructure Advisory Group has made the following recommendations for the strategy

development and progression of activities:

The group recommends that future strategy development and activities need to be taken forward via a

stranded approach. This approach recognises the fact that each element of the overall e-infrastructure

has its own characteristics and timescales with respect to methodological and technological

development. An all-encompassing strategy would not have the resolution to adequately capture the

level of detail or be able to respond on an appropriate timescale to opportunities these developments

may bring. The Advisory Group identified six strands that were to be taken forward, namely: Research

Networks, People & Skills, Data, Compute, Software Development and Security & Authentication.

Within these priorities the IAG were agreed and recommended that the continuing provision of a UK

network infrastructure, capable of addressing the future needs and aspirations of research intensive


institutions was a critically important element in the provision of a competitive e-infrastructure for the

UK Research Base and should be treated as a priority. With respect to the other identified strands, the

IAG agreed that a coordinated approach in each of these areas would deliver a suite of enabling tools and

services that would deliver a significant net benefit to the Research Base enhancing both scientific

capability and the potential for international collaboration. The advisory group made the following

comments and recommendations on each strand of activity:

• Networks – The Group felt that this was of the highest priority and that a strategy should be

developed for the delivery of a “Research Intensive Network Infrastructure” aligned to, and driven

by, the needs of the UK’s most research-intensive universities and institutes. An organisation with a

role such as that currently provided by JISC would be best placed to take forward the responsibility of

leading this strand given its background with the current JANET network and experience of

undertaking other initiatives of this kind. Strategy development should be undertaken with a full

appreciation and understanding of user needs. As such:

o Engagement with UK Research Funders (Higher Education Funding Councils, Research Councils

as well as Charities) and Research Leaders (individuals and institutions) is strongly

recommended and will be crucial in delivery of this strategy.

• People & Skills – This was another area where the group felt that there needed to be coordinated

action due to the decreasing flow of highly skilled software engineers into research. The Group also

felt that consideration should be given to ways in which the level of general e-literacy in software

development could be increased within future postgraduate cohorts to increase the potential for

spillover into computationally intensive research and provide individuals with a firm basis on which

to continue their own careers. This need was highlighted in many of the responses received to the

consultation. However, as stated above the group felt that there were sensitivities with respect to

the evolving situation on student fees and any strategy in this area would have to be carefully shaped

to address this. As such, the Advisory Group recommends that:

o Responsibility for leading this strand lie with the Higher Education Funding Councils and be

taken forward with input from Research Councils and Learned Societies to ensure alignment

with research priorities and skills shortage areas.

This broad range of engagement recognises the very wide spectrum of needs and abilities within and

across research domains and that there appears to be little or no correlation (nor anti-correlation)

between academic seniority and degree of e-literacy. In order to respond to this challenge the

Advisory Group were agreed that the response will need to institute a broad series of actions to

reflect the variation in skills levels across domains and that training would be required at all levels (of

ability and seniority). It is unlikely that this will be amenable to a one size fits all approach.

• Data – Many of the responses to the consultation focussed on common themes regarding data such

as its production, retention, curation and strategies for dealing with these issues and the Group

agreed these were areas where a coordinated approach would be necessary. In addition:

o The Group recommended that this list should be expanded to include the rapidly increasing

amounts of descriptive metadata associated with the underlying data. As well as these

considerations, the underlying standards and interoperability issues need to be addressed in

order to ensure future relevance and usability of data;

o The Group also recommended that existing models, methodologies and infrastructures

developed by research councils and charities should be reviewed for applicability to other

research areas, exploiting the lessons learned, rather than starting from the ground up.


• It was also recognised that data was an area where there would be considerable variation in needs

and stakeholders.

o The advisory group recommended that a consortium of Research Funders (Research Councils

and Charities) lead on coordinating this strand to cover the breadth of need across the research

base.

• The exact composition and chairing of this group will need to be decided and will need to take into

account linkages with HEIs and JISC. Particular issues stemmed from the very rapid increase in rates

of data generation in a variety of fields, how these quantities of data might require (and permit) new

kinds of science, and how the availability of data played to the Open Access agenda while raising

serious legal and ethical issues and (for individuals) issues of identifiability even in nominally

anonymised data.

• Compute – The group identified that future compute provision was an area where there was a clear

and pressing need for equipment sharing given the increasing scale and complexity of procuring,

supporting and provisioning for HPC systems. In addition, it was recognised that the current model of

provision at the mid-range, where each research-intensive institution invested in its own HPC

hardware could falter under the increasing need for efficiencies outlined under the Wakeham review

and its effects on capital investment and indirect cost elements of fEC. It was recommended that:

o Research Councils should retain their coordinating role for National investments in High

Performance Computing.

o The scope should be widened to include engagement with research-intensive universities in

order to determine synergies or alternative models for HPC provision.

o Further analysis of the potential use of Cloud resources should be conducted to determine areas

of the Research Base where Cloud would provide a technologically viable alternative to local

cluster systems and that capacity planning be undertaken to determine the potential usage and

economic viability of Cloud as a replacement for the capabilities currently provided locally.

o Any future road mapping exercise should take into account energy efficiency and green provisioning

(e.g. shared data-centres, systems and increased infrastructure efficiency) issues alongside

traditional metrics for investment appraisal.

• Software – As noted earlier, software development at all levels of the software stack is a clear area

for further action in order to fully utilise current investments and to put the UK in a position to

exploit next generation computing architectures (e.g. many-core and GPU architectures) and

paradigms (e.g. cloud computing and data intensive computing) across application domains. In order

to ensure that this can take place, future developments in this strand will need to be informed and

directed by the UK Research Base to ensure a full understanding of, and to develop an appropriate

response to, the challenges across research domains. To facilitate this it is recommended that:

o Researchers and Research Funders work closely with each other to make the appropriate

choices e.g. in the choice of commercial software, new community generated or re-engineering

of existing application codes to support a research area.

Initially this may take the form of the research councils’ application-to-architecture matching

activities, which have the ultimate aim of creating tool-kits and best practice to enable informed

investment and usage decisions to be made on such aspects as compiler and architecture type. The

group also recommended that:

o Current investments in software development should be reviewed with a view to developing

models of support for software as a sustained infrastructure in the long term, as opposed to

being supported by significant one off investments;

• Authentication and Security – The group recommended that:

o The development of robust authentication and security systems to enable trusted users to utilise

shared infrastructures in an open manner needed to be treated as a clear priority if the benefits

of shared infrastructure and collaboration were to be realised.

As the area of interest would be related to the resources shared over JANET or a future Research

Network it was recommended that responsibility for developing this strand should again lie with JISC and

take into account expert advice from the e-science community in future development to ensure any

framework is fit for purpose.

The Advisory Group feels that the strands identified above form a clear programme of strategic areas for

activity and provide a framework upon which an integrated e-infrastructure strategy can be built. It is

envisaged that the development of the scope, strategy, deliverables and phasing of those deliverables

will be developed within each of the strands. The group recognises that effective coordination and the

integrated phasing of activities will be a highly important factor in the successful deployment of the

framework. Given the complexity involved in the above activities and the stakeholder relationships it is

recommended that the leadership role lie with BIS.

1.4 Next Steps

The advisory group recognised that as well as the need for agility, each identified strand would need to

capture the research texture and user requirements within each strand. This is necessary to ensure that

sufficient breadth and appropriate concentration of provision for each of the elements is taken into

account. In order to progress this it is suggested that:

• Each strand be assigned to and led by a body with the appropriate research oversight (e.g. Research

Councils) or responsibility for technical provision (e.g. JISC);

• Policy and strategy development within a strand be informed and driven by a Research Strategy

and/or Technical Advisory Stream(s) which will incorporate relevant knowledge, research leadership

and technical expertise to ensure relevance and benefit of proposed developments/activities to the

research base and ensure technical feasibility. It is anticipated that this membership would be drawn

largely from the UK Research Base and include representation from the lead body for the strand to

keep informed from a funder/provider perspective;

• To ensure tensioning and communication between the infrastructure strands, an e-Infrastructure

Board/Forum would be set up with senior representation from each strand along with

representatives from the lead bodies for each strand. This would be the highest-level body in this

structure and be responsible for the development of an integrated roadmap of activities for

progression in the short, medium and longer term. The Board would also commission research

funders to formulate the response. It is suggested that this board/forum would initially be chaired by

BIS;

The Advisory Group suggests the above as an outline structure only and is agreed that the final leadership

and governance models for development of the framework be taken forward by BIS in partnership with

Research Funders and other key community stakeholders.

The Advisory Group recommends that RCUK take the lead on Compute, Data and Software

Development strands with JISC taking on responsibility for Networks and Authentication and Security.

It also recommends that the People and Skills agenda be taken forward by the Higher Education

Funding Councils, reflecting the need for action at the Undergraduate as well as Post Graduate level.


2.0 Responses received to the e-Infrastructure Advisory Group consultation exercise – January 2011

2.1 Background to the consultation

2.1.1 As part of its evidence gathering, the Advisory Group agreed that a short consultation be

conducted to gain perspectives from HEIs, Research Councils and end-user

organisations with a perceived dependence on e-infrastructure in their business.

2.1.2 The UK academic research institutions were chosen to give an appropriate mix of coverage to

account for geographical issues (e.g. consortia, regional funding etc), organisational research

remit and degree of utilisation and experience in using e-infrastructure.

2.1.3 The following summarises the responses from each stakeholder group under four key headings

to enable cross comparison: “Compute”, “Data”, “Networks”, and “People, skills and software”.

2.1.4 A summary of the emergent findings from the consultation is presented at section 4.

2.2 Research Councils

2.2.1 Networks

Although the drivers for Research Councils with respect to networks are different, all are agreed that

continued evolution in terms of the bandwidth and support provided by JANET is key to research

delivery, given the data issues outlined above and the move to more collaborative, multi-site modes of

working. Of greater importance to researchers in the EPSRC space, rather than the ability to accumulate

large, constantly accessed, highly available data sets, is the ability to transport results of completed

simulations to their institutions for post processing and visualisation activities. The links into European

network infrastructures such as GEANT and beyond are also crucial especially for those research areas

that are involved in major international collaborations.

2.2.2 People, Skills and Software

Across all councils the need for well developed, robust and readily usable software is seen as key to

science delivery. In addition, the need for a steady stream of people with the skills necessary to harness current and

future infrastructure is also recognised. However, there are very few explicit examples of current support

structures or delivery mechanisms for these elements in the responses provided. Notable current

examples of activity include EPSRC’s reshaping of its balance of investment in e-infrastructure to include

software development as a key strand of its plans, funding short-, medium- and long-term development

activities to enable maximum return from current research platforms with a view to future research

needs. Provision has also been made available for short courses at the PhD and Post Doctoral level. ESRC

also provides embedded support to increase supply of skilled individuals through the National Centre for

Research Methods and as part of their strategy for supporting postgraduate training at doctoral training

centres.

2.2.3 Data

Data in terms of its creation, transmission, curation and use is an area that has come rapidly to

prominence for the research councils. Reasons for this are that advances in experimental techniques (e.g.

next generation DNA sequencing), experimental complexity (e.g. increased detector resolution and

fidelity) and experimental sophistication/scale (e.g. geospatial sensor and data sets, model data) have

become increasingly pervasive in day-to-day research as opposed to the preserve of a few key groups.

The so called “data deluge” is recognised by BBSRC, NERC, STFC, MRC and ESRC and feeds directly into


consideration for future provision in terms of the types of service models that could support the growing

needs for computing hardware, software and networking capabilities for data driven science within those

councils, which is fuelling interest in cloud type system solutions. EPSRC recognises the need for a

longer-term plan for research data from simulation and data. However, consultation with the EPSRC community

revealed this to be of secondary importance compared to availability of internationally competitive

compute capabilities.

2.2.4 Compute – All councils recognise the need for appropriate compute resource across their remits.

The level and types of resources currently provided are in line with the research challenges and

bottlenecks that each faces; more resolution on individual needs is provided in the individual

responses.

• High Performance Computing - In the case of EPSRC, STFC and NERC there is a clear and

continuing need for High Performance Computing (mid to high Tera- through to Petascale) to

tackle close-coupled problems that are not currently amenable to solution via a distributed

or cloud resource. These councils are looking to continued National (mid-term) and

International (long-term) collaborations/partnerships to enable continued competitiveness

in dependent fields such as turbulence simulation, climate change, local or long term

weather prediction, and quantum chromo-dynamics. BBSRC has previously seen a need for

this type of compute provision (HPCx), although its increased investment in

HECToR is essentially unused at the current time. Possible contributory factors for this

are stated in their response. ESRC and MRC do not currently see the lack of HPC in their

portfolio as a constraint in delivering their plans although ESRC does anticipate increasing

computational demand on the 2014 timescale.

• Shared/distributed, cloud and ‘novel’ resources – At a National level the Research Councils

do not collectively fund a distributed compute resource although at an individual research

council level EPSRC has funded the National Grid Service as part of the core e-Science

initiative, although direct funding for this will be discontinued post March 2011. Individual

councils do operate shared infrastructure although this is generally to exploit investments

made in experimental facilities (e.g. STFC investment in the e-Science Centre at RAL

supporting ISIS, DLS and CLF) and increasingly for transfer and access to large data sets in the

case of STFC, ESRC and NERC, rather than for use as a distributed/cloud type compute

resource. With respect to cloud computing, although this is a new paradigm (arguably based

upon previous e-research) it has gained a very high degree of interest from all Councils.

Principally this is in areas where data analysis is a key consideration i.e. where high levels of

storage (Petascale and beyond) and an elastic compute capability are needed in close

proximity for analysis and interpretation. Councils that have stated a clear interest in this

area are MRC, BBSRC, ESRC and NERC. EPSRC is also looking at the opportunities for access

to compute via the Cloud and, working with JISC, has recently funded (start date Feb 2011) a

small number of projects to evaluate their use. In the area of novel resources/architectures

there is very little take-up of GPU, FPGA and other such architectures due to

the currently high barrier to usage (through software coding complexity) for the average

user. EPSRC has invested in a small test bed facility associated with the HECToR service to

enable potential users to have access to a well-supported system and a high standard of

training. This will be available from March 2011 to all HECToR users. In addition, the Council

has also undertaken an extensive Architecture Comparison Exercise to develop a suite of

knowledge and tools to guide future procurements to ensure the best match possible

between user code needs and underlying hardware architecture.

Strategy for provision – EPSRC and ESRC both have clear, council-initiated and council-led strategies for compute

provision constructed with community input and reflecting the multi-year research strategies of those

councils; a top-down approach would be an appropriate description. Whilst all other councils recognise

the need for compute provision as part of their strategies, fulfilment of need is directed in a bottom-up

manner from the community, with provision on the following basis: project-by-project (STFC, NERC,

BBSRC, MRC); block community resource (STFC’s HPC provision for some communities with LFCF funding);

involvement in external partnerships (NERC Met Office, EPSRC PRACE). Both mechanisms have pros and

cons. However, there is a degree of uncertainty in the sustainability of a multi-stranded, multi-funder

approach as used by STFC for HPC provision to parts of its community; this is recognised in the Council’s

response.

2.3 Higher Education Institutions

2.3.1 Networks

All institutions responding to the consultation consider that the local networking capacity available to

them is adequate for current research applications, with a transition from one to ten Gbit/s seeming to be the

norm across the responses. The majority also have this as an ongoing strand in their development

strategies. The universities are unanimous in their support for the continuation and further development

of the JANET network infrastructure (increased bandwidth and decreased latency) and in stressing the criticality of its

continued presence to them. Southampton and UCL make specific reference to the fact that it is currently

quicker to courier 1 TB of data on a portable drive than to transfer it over the network. Additionally, respondents

consider that continued investment is key to enabling any future uptake of Cloud systems and that this

investment should be centrally provided.
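
The courier comparison can be made concrete with some simple arithmetic. The following is a minimal sketch and is not drawn from any of the responses: it assumes decimal units (1 TB = 10^12 bytes), and the 24-hour courier time is a purely hypothetical figure chosen for illustration.

```python
# Illustrative arithmetic only. The courier time used below is an assumed
# figure, not one reported by any respondent.

TERABYTE_BITS = 8 * 10**12  # 1 TB expressed in bits, assuming decimal units


def transfer_seconds(size_bits: float, rate_bits_per_s: float) -> float:
    """Time to move size_bits at a sustained end-to-end rate."""
    return size_bits / rate_bits_per_s


def break_even_rate(size_bits: float, courier_seconds: float) -> float:
    """Sustained rate (bit/s) below which the courier arrives first."""
    return size_bits / courier_seconds


if __name__ == "__main__":
    for label, rate in (("1 Gbit/s", 1e9), ("10 Gbit/s", 1e10)):
        hours = transfer_seconds(TERABYTE_BITS, rate) / 3600
        print(f"1 TB at a sustained {label}: {hours:.2f} h")

    courier_hours = 24  # hypothetical door-to-door delivery time
    rate = break_even_rate(TERABYTE_BITS, courier_hours * 3600)
    print(f"Break-even rate for a {courier_hours} h courier: {rate / 1e6:.1f} Mbit/s")
```

At a sustained line rate of one Gbit/s, 1 TB would move in a little over two hours, so a courier could only be quicker if the throughput actually achieved end to end were far below the nominal capacity of the links. Under the assumed 24-hour delivery, the break-even point is roughly 93 Mbit/s.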

Following on from physical infrastructure, the ability for researchers to seamlessly use resources whilst

visiting host institutions via single sign-on authentication is a continuing priority, with current eduroam

and Shibboleth systems being widely seen as a success.

2.3.2 People, Skills and Software

In the academic context these are inextricably linked; as stated in the Edinburgh response: “the major

directions of change for computationally-enabled science and commerce are toward extreme scale: both

in terms of analysing vast and disparate data sets and a further thousand fold increase in computer

speed. Both will involve technology development, but also revolutions in algorithms, software and

research methods.” This statement is echoed throughout many of the responses, as there is a concern

that the flow of well-trained, experienced and specialised software developers into the research area is

decreasing at a time of increasing need. Although there are pockets of expertise such as the Edinburgh

Parallel Computing Centre, the Hartree Centre/CSED at Daresbury and some locally available expertise, it

is felt this will not be sufficient to sustain future development of science applications. Although the

Bristol response makes reference to up-skilling or re-skilling of staff such as librarians, this may not be

enough. As such there is also a need for increasing the general “e-literacy” of postgraduate researchers

across the Research Council remits in general computer science and software engineering skills to

increase the possibility of spill-over into the creation of highly skilled scientific software engineers of the

future and to ensure postgraduates are armed with the skills necessary to further their careers.

2.3.3 Data

In common with the Research Councils, the universities recognise that research has become very data

intensive and that the demand for storage is growing at an accelerated rate. Both councils and

universities recognise that the issues surrounding data are not solely related to the provision of storage;

the enabling framework that gives data its usability and value also has to be considered from the

ground up, i.e. retention, management, accessibility, security and ownership. As examples of this, UCL,

Southampton and Bristol are each investing in Petascale research data centres to supply well-supported data

facilities in their institutions. However, in common with other universities, they are increasingly concerned

about the ongoing cost of data curation, especially given the requirement of funders (both charities and

research councils) to store this data for timescales that significantly exceed that of the grant or award

that first generated it. As such, universities are increasingly looking for research councils to set data

management policies for research data produced from grants that are funded by them. A point related to

“strategy for provision” in the previous section is that universities are again looking to regional alliances

and partnerships in order to achieve economies of scale in co-locating storage and the support

infrastructures needed for longer-term data management.

2.3.4 Compute

• High Performance Computing - Many of the universities responding to the consultation have a long

and successful history of using HPC systems and this is evident from the responses they have

submitted. At one time this capability was mainly facilitated through departmental “Commercial Off

the Shelf Technology” (COTS) type clusters and access to National facilities such as HPCx. The SRIF 3

funding round (2004) facilitated an explosion in the provision of HPC at the local level, with some of

the responding institutions making significant investments in HPC that were individually comparable

to the then operating UK National and European systems. Cambridge, Cardiff, Bristol, Southampton

and UCL are notable examples, procuring systems firmly between the outgoing 12 TF HPCx system and

the incoming HECToR Phase 1 system at 60 TF. These systems have been widely adopted within these

institutions, with hundreds of registered users routinely using them to conduct their research and

thus support the “international excellence” strand of many universities’ strategies. These systems are

now seen as being a key element in the research infrastructure of all responding universities and

perceived as an attractor to international academic talent. All universities responding have some

form of centralised HPC. This is either in addition to departmental hardware (as above) or a

consolidation of departmental cluster systems to counteract the expense of duplicating multiple

systems and support costs across their estates, thus increasing sustainability of provision (Manchester

is a notable example of this move). As such, the universities see a clear financial as well as scientific

case for their sustained investment in HPC. In addition to supporting institutional strategies, the

systems provide a much easier transition from local to the much higher capability investments at the

National (HECToR Phase 2 and above) and international level (PRACE, TeraGrid) thus increasing the

ambition and collaborative reach of these organisations.

• Shared/distributed, cloud and ‘novel’ resources – With respect to shared and distributed resources,

all responding universities have deployed CONDOR services within their institutions as a

relatively low cost way of harnessing spare processing capabilities of linked PCs spread across their

estates. Although this provides a cost effective way of creating a large and useful computational

resource that would otherwise go unused, the limitations are recognised. However, the resource

does provide a step toward meeting researchers’ needs before the transition to institutional HPC

systems. As with the Research Councils, Cloud computing has created a high degree of interest in the

respondents’ institutions and all are very much aware of the possibilities that could be opened

through the Cloud model. Although the possibilities and potential of Cloud are well known to

respondents, some are wary of large-scale implementation or replacement of current services.

Barriers to entry are seen as the ongoing service costs associated with provision by a third party,

the contractual arrangements necessary and, to some degree, the lack of control in what is actually

provided over time. This is broadly in line with the JISC cloud computing for research report. This

being noted, the Cloud model of provision is still seen as highly valid: a watching brief is being

maintained, and individual researchers are being directed toward Cloud where workloads are seen as

compatible with that model. With respect to novel architectures, a number of institutions have

introduced small-scale evaluation/production systems based on GPGPU technology attracted by the

high performance vs. cost ratio. However, it is unclear from the responses what the utilisation of

these systems is like, what application areas are using them and whether these institutions intend to

scale up their commitment. From discussions with representatives of HPC-SIG, it seems unlikely there

will be mass uptake of such technology for some time, owing to the complexity of porting legacy

applications to such systems in the current absence of a mature programming environment. As

such, uptake will be restricted, as at the national level, to those with the technical programming staff

needed to overcome this barrier, or where there is a clear and specific priority to be addressed that

requires the investment.

Strategy for Provision – The Universities responding to the consultation have comprehensive strategies

for the provision of research computing and infrastructure within their own institutions. All institutions

have a clear appreciation of the direction of travel for e-infrastructure within their organisations and this

is informed by direct interaction with users and analysis of their current and future needs. Sustainability

is a key issue in these strategies and, although modest developments can be incorporated through

reinvestment of fEC income, additional funding sources to bolster this investment should be made

available. Aligned with this is the universities’ own realisation that their individual requirements have

grown to such a scale that significant further development and funding cannot be undertaken in the

context of a single organisation due to physical infrastructure restrictions (power, cooling and space). As

a result, these organisations are now starting to pursue partnership agreements (UCL, Oxford,

Southampton) or are becoming more open to regional alliances for all aspects of e-infrastructure (Bristol,

Cardiff, Manchester). However, the responses tend to indicate that organisations would need a steer

from an interested though independent third party, such as BIS, to provide a boundary

framework and enabling investment for these activities to take place.

Examples of the responses received to the consultation from the University of Bristol and Cardiff

University are included at Annexe C.

2.4 Wellcome Trust – Sanger Institute

The response received from the Sanger Institute clearly articulates the role that e-Infrastructure has to

play in the continuing success of the Institute and the wider future applications of genomics in the delivery

of healthcare. The WTSI is driven by the production, analysis and storage of data from sequencing

technologies. As a result, it has been drawn out here to distinguish its requirements and drivers from

those of HEIs with wider application drivers.

• Networks – Although no specific networking environment is stated in the report, it is assumed that

as for HEIs there is a growing requirement for the ability to move large data sets in good time. In

addition, the need for robust, scalable and secure authentication systems is stated as being a priority,

especially as data becomes increasingly used in the clinical realm.

• People, Skills and Software – In common with the research councils and university responses, these

issues have become a high priority for attention. With respect to people, the WTSI states that

recruitment of sufficient numbers of skilled staff is challenging and that this trend is likely to continue

unless further investment is put into training. On the issue of software, as with high performance

computing, the underlying hardware is advancing at such a pace that continual

development of software is required in order to keep up.

• Data – The current and growing data requirement for keeping pace with Next Generation

Sequencing (NGS) technologies used by the WTSI is considerable. The WTSI currently has twelve

Petabytes of storage with the associated EBI having similar amounts of raw storage space available to

it. To give some idea of the scale of this investment the HECToR National HPC service only has a

capacity of just over one Petabyte associated with the system.

• Compute – The Institute has found that centralised rather than distributed facilities have proved

to be most cost-effective in taking forward its research and currently has its own large server farm that

has undergone significant expansion in the last 12 months. These in-house capabilities are supplemented by other

virtualised systems and cloud computing resource on an as-needed

basis. The overall IT requirement is evaluated on a weekly basis by a standing

committee.

2.5 External Perspectives

IBM, Microsoft and Hewlett Packard were asked to provide an input to the exercise from an external

perspective and as organisations centred on providing infrastructural IT solutions (such as cloud

offerings). The full responses from these organisations, given in Annexe E, raise interesting

observations on e-infrastructure and support many of the trends identified through the RC, HEI and

WTSI responses.

3.0 European and International Activity in e-infrastructures

3.1 At the 7th December meeting, the group requested that an overview of European and

international activity be included with this report to give a context to this review:

3.1.1 European Commission Activity

3.1.2 Funding for e-Infrastructure from the European Commission is delivered by the “Information and

Society” directorate. The major infrastructures all appear on the ESFRI² roadmap for large-scale

research infrastructures coordinated by the Research Directorate.

3.1.3 The e-Infrastructures activity, as a part of the Research Infrastructures programme, focuses on

ICT-based infrastructures and services that cut across a broad range of user disciplines. It aims at

empowering researchers with easy and controlled online access to facilities, resources and

collaboration tools, bringing to them the power of ICT for computing, connectivity, storage and

instrumentation. This allows for instant access to data and remote instruments, "in silico"

experimentation, as well as the setup of virtual research communities (i.e. research

collaborations formed across geographical, disciplinary and organisational boundaries).

3.1.4 e-Infrastructures foster the emergence of e-Science, i.e. new working methods based on the

shared use of ICT tools and resources across different disciplines and technology domains.

Furthermore, e-Infrastructures enable the circulation of knowledge in Europe online and

therefore constitute an essential building block for the European Research Area (ERA).

3.1.5 The Communication from the European Commission on ICT Infrastructures for e-Science (COM

(2009)) puts into context the relation between modern science and ICT-based infrastructures and

presents a renewed strategy for achieving leadership in Science, developing world-class

e-Infrastructures and exploiting their innovation potential.

3.1.6 The Digital Agenda for Europe initiative is one of the seven flagship initiatives of the Europe

2020 Strategy for smart, sustainable and inclusive growth. It recommends sufficient financial

support to joint ICT research infrastructures and innovation clusters, further development of

e-Infrastructures and the establishment of an EU strategy for "cloud computing",

notably for government and science.

3.1.7 Under FP7, the e-Infrastructures activity is part of the Research Infrastructures programme,

funded under the FP7 'Capacities' Specific Programme. It focuses on the further development

and evolution of the high-capacity and high-performance communication network (GÉANT),

distributed computing infrastructures (grids and clouds), supercomputer infrastructures,

simulation software, scientific data infrastructures, e-Science services as well as on the

adoption of e-Infrastructures by user communities.

2

3.1.8 UK engagement in the major EU activities includes:

• The JANET network is the UK link into GEANT³; it is recognised that the benefits of the UK’s early

involvement in developing JANET were very great, and it is envisaged that having a

single means of connecting JANET to GEANT will lead to similar benefits in scientific collaboration

across Europe.

• The National Grid Service, with support from JISC, acts as the UK national grid initiative partner in

the European Grid Initiative⁴, now the European Grid Infrastructure (EGI). This new organisation, formally created in

Feb 2010, is based in the Netherlands and was established with grant support from the EC; it

aims to co-ordinate the activities of the national grid infrastructures. The EGI also incorporates

activities formerly supported by the Commission under the EGEE project, which, in the UK, was

important for enabling particle physics to develop the necessary ICT for transporting, storing and

analysing data in readiness for the LHC.

• The Partnership for Advanced Computing in Europe (PRACE)⁵ was formed in 2010 with grant support from

the EC. EPSRC, on behalf of the research councils, is one of the founding members of the Association.

PRACE aims to coordinate the European investment in leading large-scale supercomputing

systems and to make these systems available to researchers across Europe. The EC sees

this route as providing opportunities for industrial competitiveness as well as academic

excellence within Europe. In the initial phase of PRACE the UK participates as a “General” rather

than “Hosting” Partner. This decision to enter into PRACE at a lower level than comparable

European countries (Germany, France) was mainly due to the financial climate at the time the

decision was taken, and to the significant risk that the cost of the full Hosting Partner level

(circa £100M over the initial five-year phase) would have posed to continued access to HECToR for UK-based researchers.

• The EU has funded an 18-month project to develop a European road map for exascale software

development (called EESI)⁶ so that Europe can play a strong role in a developing international

initiative in this area. The activity is led by EDF in France with EPSRC as a member of the

consortium on behalf of researchers at the University of Edinburgh, STFC Daresbury and

Rutherford laboratories and Numerical Algorithms Group Ltd. Individual projects are also funded

in this area with UK universities as partners.

• The EU has provided preparatory phase funding to the European Life Science Infrastructure for

Biological Information (ELIXIR)⁷ project, which is looking at developing a biological sciences data

infrastructure. BBSRC is playing a leading role in this infrastructure.

³ http://www.geant.net/pages/home.aspx - GEANT website;

⁴ http://web.eu-egi.eu/; http://www.egi.eu/ - European Grid Initiative and Infrastructure websites;

⁵ http://www.prace-project.eu/ - PRACE website;

⁶ http://www.eesi-project.eu/pages/menu/homepage.php - European Exascale Software Initiative;

7

4.0 Summary and Identified trends of the consultation

4.1 In reviewing the responses, a number of common trends have been revealed:

• Network – The supra-exponential growth in data and the need to share this data for effective

collaboration within the UK and with international partners mean that future investment in, and

continued development of, JANET or an equivalent network infrastructure to service the UK

research base is seen by all respondents as being crucial in the provision of a balanced

e-infrastructure. Securing this is a priority.

• Software, People and Skills – All respondents have noted the importance of robust and usable

software at every level of the e-infrastructure. In order to maintain and provide this, a steady

stream of software engineers and developers is necessary. However, although there are many

commercial application developers in the job market, a scientific context

requires industry-standard software engineering techniques combined with the mathematical

and discipline-based skills to interpret and implement algorithms in code. This type of developer

appears to be in very short supply.

• Data – This is seen as being THE big issue of the moment due to the explosion of its production

across all areas of science and the growing requirement to ensure that data be kept for years and

decades after production. Although this raises technical issues (as outlined in the WTSI and a

number of HEI responses), there are also funding and best practice implications. A number of

responses have raised a pressing need for further guidance and dialogue in this area

to provide a best practice framework for now and a strategy for the future. Although there are

initiatives such as the UK Research Data Service and the Institutional Data Management

Blueprint, these do not appear to be connecting directly into funders and this communication

issue needs to be addressed to avoid duplication and increase the visibility of progress made to

date.

• Compute – There appears to be an ongoing need for HPC as a continuing strand of the

infrastructure at both a local and National level to run simulations

that cannot currently be performed by other means because they require a closely coupled environment, i.e. one

where the processing, memory and disk are highly and efficiently interconnected. Cloud

computing is starting to establish a presence, and the HEIs and councils are fully aware of the

possibilities of this model in all its forms and have started making use of the commercial cloud

(e.g. Amazon EC2). It is highly likely that as both “Platform as a Service” and “Infrastructure as a

Service” become more mature and known quantities, they will fulfil some traditional HPC

requirements. The reasoning is well stated in the HP response to this consultation. However, it is

too early to quantify the degree of uptake in this usage mode. It is likely from the responses

received that cloud will, initially at least, make inroads where multiply accessed data sets

coupled to an elastic compute provision would be required. This model is already in operation in

the activities based at WTSI and EBI; further candidates could be communities in the ESRC and

NERC space.

Dr Dai Jenkins

Research Infrastructure Programme

EPSRC

Annexe A

E-infrastructure Advisory Group ToR

Rationale

There have been many reports and strategies produced recently that concern e-infrastructure and the

associated underpinning e-Science and several of these contain proposals for funding from the Large

Facilities Capital Fund. To identify the priorities for investment and coordination in the near, mid and

longer term it is proposed to establish an advisory panel to meet this autumn to provide advice to BIS and

the Research Councils on the elements of an e-infrastructure that should be taken forward during the

next spending review.

Terms of Reference for Advisory Group

To consider the outcomes of recent reviews and strategies (e.g. International Review of e-Science,

e-infrastructure review, Cross Council HPC strategy and the current national e-infrastructure review) and, in

the context of the Research Councils’ overall strategies and objectives for the next spending review

period, advise on the priorities for investment (capital and recurrent) in e-Infrastructure for research in

the period 2011-2015. The scope of this review will be to look at 'e-infrastructure for research'. This

should include:

• Consideration of the reports listed above

• Identifying priorities for investment by Research Councils, taking account of relevant Funding Council

strategies and the outcome of the CSR on 20 October

• Identifying whether current delivery structures are appropriate to deliver priorities identified and if

not, what might need to be changed

To report by 28 February 2011, jointly to RCUKEG and Professor Adrian Smith, Director General of Science

and Research in BIS

To be chaired by Paul Williams, Head of the Research Councils Unit in BIS

Proposed membership of the Advisory Group

There is an on-going review of JISC (reporting to HEFCE) and the Chair of this review should be contacted

to ensure alignment between the two. This review will take evidence from JISC, and other bodies where

appropriate.

The group comprised the following members:

Paul Williams, Department for Business Innovation and Skills (Chair)

Brian Collins, Chief Scientific Advisor, Department for Business Innovation and Skills

Douglas Kell, Research Councils UK Executive Group

Lesley Thompson, Research Councils UK Research Group

David Sweeney, Higher Education Funding Council for England

Paul Hagan, Scottish Funding Council

Sheila Rodgers, Department for Employment and Learning Northern Ireland

Chris Hale, Universities UK

Annexe B

Institutions contacted to participate in the consultation exercise:

Organisation                                Response
University of Southampton                   R
University of Cambridge                     R
Queen's University of Belfast               DNR
Cardiff University                          R
University of Oxford                        R
Imperial College London                     R
University of Bristol                       R
University of Edinburgh                     R
University College London                   R
The University of Manchester                R
University of East Anglia                   R
Newcastle University                        R
The Wellcome Trust Sanger Institute         R
EMBL - European Bioinformatics Institute    DNR
IBM UK Labs Ltd                             R
Hewlett Packard plc                         R
Microsoft Research Ltd                      R
JISC                                        R
BBSRC                                       R
EPSRC                                       R
ESRC                                        R
MRC                                         R
NERC                                        R
STFC                                        R
EDF                                         AR
Shell UK                                    AR
BP                                          AR

R = Received

AR = Apologies Received

DNR = Did not respond

Annexe C

Letter to invited HEIs

Dear XYZ,

I am writing to invite your organisation to contribute information to the recently formed e-infrastructure

Advisory Group. The deadline for submission of responses is Friday 14th January 2011.

Background

The e-Infrastructure Advisory Group is composed of representatives from Research Councils UK, the

Wellcome Trust, Universities UK and Funding Councils. Its aim is to develop a clear vision for the

development of the UK’s e-Infrastructure and to produce a corresponding multi-year framework for

delivery.

In considering the recommendations of the 2009 International Review of e-Science and RCUK

e-Infrastructure Review, the group has recognised that in order to ensure relevance and fit of the

framework, it will need to be informed from an institutional as well as a National perspective. In order to

achieve this, the Group invites your institution to provide information in the following key areas:

Current and future e-infrastructure need – What are your organisation’s current scientific drivers; to

what extent does your current e-infrastructure support or enable these drivers and what capability

(hardware and support) do you currently have in place? How do you see this landscape

developing/changing in future, where are the bottlenecks (locally and nationally), and what processes has

your organisation developed to identify, and quantify, your future e-infrastructure needs?

(Application/science drivers, increases in hardware capability (storage and compute), Local Area Network

capabilities (latency, bandwidth), skills and software development, economic and business case)

Organisational Strategy – How are e-infrastructures linked in to your organisational strategy, and what

delivery/governance structures are in place for the effective development and utilisation of new and

existing e-infrastructure in your organisation? (importance in delivery of mission, centralised or

departmental development strategies, sustainability models for investment, research data strategy

(usage and storage))

External interfaces and dependencies – Does your organisation currently link into any regional or

cross-institutional initiatives or alliances to share/develop e-infrastructure and what are the National

infrastructures/organisations/standards that need to be developed to enable your organisation to deliver

its research programme? (consortia and initiatives, JANET development (latency, bandwidth), security,

integrity, authenticity, resilience and availability)

In addressing the points above, please provide comments under the headings provided and covering the

requested detailed information using the italicised descriptors. Please could you send comments to

[email protected] by the end of Friday 14th January 2011.

Yours sincerely
