NATIONAL COMPUTATIONAL INFRASTRUCTURE
PROFESSOR LINDSAY BOTTEN CSIT Building [108]
Director The Australian National University
Canberra ACT 0200 Australia T: +61 2 6125 9800 F: +61 2 6125 9805 E: [email protected] http://www.nci.org.au
Ms Clare McLaughlin General Manager
Research Infrastructure Branch
Department of Innovation, Industry, Science and Research GPO Box 9839
Canberra ACT 2601
Email: [email protected]
22 July 2011
Dear Ms McLaughlin
Re: Response to the Exposure Draft of the
Strategic Roadmap for Australian Research Infrastructure
Thank you for the opportunity to provide feedback on the Exposure Draft of the Strategic Roadmap for Australian Research Infrastructure. Our response commences with some comments of a general nature, focused on matters relating to eResearch infrastructure, followed by some particular remarks pertaining to specific requirements that have been referred to in the Exposure Draft.
Comments Pertaining to eResearch Infrastructure
Context
In general, in the years since the release of the first NCRIS Roadmap, the research communities and research institutions have made considerable strides in their integration and embedding of eResearch infrastructure and technologies into the practice of their research. This maturation, in quite a number of cases, sees institutions recognising the impact of eResearch on research outcomes and their missions, and taking consequent responsibility for this transformation in tangible ways—most notably in changed internal practices, and in investments that support these.
strategic planning. In this sense, the intention expressed in the Exposure Draft to “allow national capabilities to become comprehensive service providers of high‐end services” is welcomed, insofar as it flags a significant shift towards the provision of vertically integrated services to key research communities by comprehensive/integrated service providers.
This vertical integration of services for targeted communities is an emerging reality today. For example, at NCI, integrated services comprising high-performance and data-intensive (cloud) computation, mass data storage and management, together with expert support and development, are provided to the climate change, earth system science and environment communities, while individual components constitute a suite of generic services, valued by numerous research communities. While the case for vertically integrated services is made in the Exposure Draft, it would be particularly helpful to the Australian research community if this point were to be amplified and articulated more strongly in the final version of the Infrastructure Roadmap. In the Exposure Draft, its significance and emphasis are understated and unnecessarily qualified.
While vertical integration aligns naturally with the NRIC principles of supporting national priorities, and research of excellence and impact, horizontal integration is not an effective path to the realisation of these NRIC goals. Indeed, it can be their antithesis, as past experience with untargeted, horizontal IT services has proven. The criticisms to date of the implementation of the eResearch infrastructure stem in part from the “horizontal”, infrastructure-focused framework that has been adopted, and also from the effectiveness of the governance arrangements for these investments, which have driven this “horizontal” perspective.
Prioritisation and Funding Issues
The Strategic Framework (pp 12–15 of the Exposure Draft) emphasises the importance of prioritisation in infrastructure investments to ensure alignment with national research priorities and areas of research strength (either extant or burgeoning), and to support requirements and outcomes that are nationally strategic. It follows that once priorities have been accorded to particular research foci, these priorities should flow through into the totality of infrastructure investments supporting these areas, in effect underpinning the vertical integration of services that are required by research communities and the research institutions partnering such initiatives and (typically) providing co-investment.
While prioritisation of purpose has been a significant feature of investments under the Super Science program, the implementation of investment planning, at least in the area of eResearch infrastructure, has been patchy and inconsistent in this regard. Exemplifying this is the area of climate change and earth system science, in which the HPC investments have been strongly prioritised. However, the development of associated data storage resources and high-value software services (tools and virtual laboratories) has not been similarly prioritised, necessitating competition for resources through the NeCTAR and RDSI programs (both of which have distinct and unprioritised project plans) in order to complete the infrastructure and services fabric.
We suggest that future infrastructure developments, particularly those that will involve significant co-investment, will be strengthened by noting, in the Final Roadmap, the critical nexus between prioritisation of purpose and the establishment of vertically integrated services, and the need to ensure that investment planning accords with such an axiom.
eResearch Governance
With increasing responsibility being taken by a number of research institutions for the provision and support of eResearch infrastructure and methods in transforming and realising research outcomes, and with this trend likely to expand and accelerate, it is apposite for DIISR to review its approach to the oversight and governance of the eResearch investments that it makes. With the advice available to DIISR to date having advocated or endorsed the principle of horizontally integrated investments, there is a growing dichotomy between this approach and that of the vertical integration of services required by, and being pursued by, substantive collaborations among research organisations.
The financial framework for eResearch infrastructure and services is likely to evolve towards a balance of Commonwealth (DIISR) and institutional commitments, with this trend already in evidence in some capabilities. In such circumstances, the existing governance, advice and coordination mechanisms internal to the Department may require some refinement. In particular, governance arrangements will need to reflect, and give encouragement to, the “vertical”, with Commonwealth contributions being seen as supportive rather than predominantly determinant. We thus suggest it is important that the Final Roadmap refer to a process of review for such matters.
Engagement of the Research Councils in Infrastructure Planning
While the development of the Roadmap is following similar procedures to those adopted in the original NCRIS investments, and also for the 2008 update of the NCRIS Roadmap, the absence of any formal involvement in this exercise by the national research councils is at odds with the guiding principles, particularly in the light of their direct engagement with the funding of national research priorities, the strength of their institutional engagement, and their mechanisms for identifying major research infrastructure requirements nationally. This significant point was made previously in the March 2011 version of the Roadmap Discussion Paper and should be noted by NRIC and the Department as an important feature of ongoing infrastructure planning.
Matters of Sustainability
While the matter of sustainability is of importance in general, it is a particularly critical factor for high-end computational infrastructure, characterised not only by high capital costs but also by high recurrent costs. The implementation of both vertical and generic (horizontal) services that integrate contemporary HPC infrastructure is a special case that highlights the critical role being played by institutions which are taking substantial and increasing responsibility for sustaining both the infrastructure and the human-centric services required to deliver impact and outcomes. While sustainability was referred to in the March 2011 Discussion Paper, its omission from the Exposure Draft is a matter that we would urge DIISR to reconsider in the Final Roadmap, particularly since it focuses attention both on the substantial changes that have occurred in the research infrastructure landscape in recent years, and on the increasing involvement of research institutions in taking responsibility for sustaining, planning and implementing high-end infrastructure.
investments—implicit in the adoption of a horizontal or infrastructure-focused implementation. As stated in our response to the Roadmap Discussion Paper, NCI subscribes to the view that eResearch is now an integral part of contemporary research methodologies, and that its development should be integrated into research “verticals”, becoming their responsibilities. Coordination between the “verticals” then becomes a natural requirement that exists to ensure appropriate access, the development of a comprehensive fabric without “holes”, and to ensure that infrastructure of genuinely national scale can be created.
If, however, the focus on enhanced coordination should lead to the establishment of an overarching governance layer controlling eResearch investments, not only would this be counter to the growing responsibilities being taken by institutions for infrastructure development and sustainability, it would inhibit, and perhaps retard, progress in these directions. Steps should be taken to avoid this adverse outcome.
Facilitation of Investment Plans
The Exposure Draft refers to the development of investment plans for capabilities by a facilitation process which is referred to as a “structured consultation with relevant stakeholders at the capability level to articulate the specific investments, co-investment, recommendations as to location and operating entity, and the operating model for each capability”.
In framing a facilitation process, it is important for DIISR to acknowledge the maturity that now exists in a number of capabilities, the substantial changes in the infrastructure landscape that have occurred in recent years (relative to the situation in 2005-06 at the time of the previous NCRIS facilitation), and also the substantive role now played in particular capabilities by institutions in their planning, implementation, and sustenance. It follows that facilitation of the original NCRIS kind is unlikely to lead to successful outcomes for the eResearch investments.
Matters relating to Specific Requirements identified in the Exposure Draft relating to eResearch
Data
While the references to data in the Exposure Draft emphasise its capture, aggregation, transmission, storage, access and re-use, it is not until the penultimate point on page 24 (of the bulleted list commencing on page 23) that there is any reference to “analysis, visualisation and data mining”. We suggest that a critical element for the discussion on data needs to be the value that is to be added to the data, not only through the inclusion of metadata, cataloguing and curation, but also through computational transformation, analysis and searching techniques. Such requirements, particularly in the case of massive datasets, necessitate advanced computational infrastructure including high-performance parallel file systems, and substantial computational resources including cloud and supercomputing facilities. Conceiving the Google search engine not as a massive data repository but as a massive distributed computer in which the data is embedded may be an effective prompt for recognising the critical value that computation adds to the utility and value of research data.
Reference to Exascale Computation
Within the text, exascale computation is referred to as necessitating “a new breed of computational experts”. We suggest that this comment merits some qualification since, in the present circumstances, the mention of “the exascale” refers more to the journey than to the destination. The path to the exascale passes through the delivery of research outcomes, impact and value at the petascale, the mid-petascale and the high-petascale. The exascale is thus effectively four generations of infrastructure from that being implemented for 2012, with each element of this pathway being increasingly dependent on expert human capital.
Such expertise needs to be developed not only in the infrastructure layers, to advance research outcomes from current and next generation infrastructure, but also in the research capability of computational science from which new mathematical and numerical methods, new algorithms, and new computational approaches, informed by the needs of nationally critical applications, will be required to solve research problems of growing complexity on increasingly parallelised architectures.
Computational Science
The break-out box on Computational Science (page 28 of the Exposure Draft) provides a welcome highlight for this important capability which, as is stated, is providing a very important third route to knowledge discovery. We would suggest, however, two changes to the approach that has been adopted in highlighting the importance of computational science.
Firstly, we suggest some clarification of the “integrated approach” of eResearch infrastructure and the fabrication capability. Specifically, the reference to fabrication in the Australian context is somewhat oblique.
Secondly, the practice of computational science needs to be linked with a research dimension, in addition to the infrastructure dimension referred to in the Exposure Draft. The critical feature of this is the integration of research capabilities in computer science and computational mathematics (driven by national strategic applications) with development capabilities that would be provided within the infrastructure/services layer.
Concluding Remarks
The translation of the Roadmap into a Budget submission will require depth, detail and justification. Regrettably, the description that is provided for many of the capabilities lacks these attributes, presumably as a consequence of trying to express complex requirements in limited space. If the value of the Final Roadmap is to be realised, matters of detail and the subtleties of complex and critical underlying issues need to be addressed in the paper, with this perhaps requiring the relaxation of what appear to be artificial strictures on the length of the document.
Yours sincerely
Lindsay Botten
Director, National Computational Infrastructure
Cc: Emeritus Professor Mark Wainwright, Chair, NCI Board