International Surface Temperature Initiative progress report
January 2012
Authors: Peter Thorne, Jay Lawrimore, Kate Willett, Blair Trewin, Richard Chandler, Andrea Merlone
On behalf of the steering committee
1/31/12
This constitutes the first in an envisaged annual set of progress reports by the International Surface Temperature Initiative. The primary purpose of these reports is to update the
endorsing bodies of the initiative (WMO, TIES, BIPM) regarding progress against stated goals. Feedback from endorsing bodies is welcomed. This report will also be made available on the Initiative blog, and feedback there is also welcome from other interested parties and
stakeholders. Attached annexes contain detailed updates from the Initiative working groups on the databank and on benchmarking and assessment, completed in November 2011, as well as the Implementation Plan.
SUMMARY
Overall progress
From a standing start in September 2010 the Initiative has grown substantially. A steering committee, two working groups and several task teams have been constituted and all are producing demonstrable outputs at the present time. The Initiative website and blogs have grown significantly and continue to be updated on a regular basis. The Initiative is recognised by WMO and TIES; formal recognition from BIPM has been sought and is in process. This recognition is key in ensuring legitimacy and engagement, and in providing feedback from relevant communities. Governance documentation has been formalized and accepted, and the governance working group disbanded as a result. An implementation plan has been
formulated by the steering committee and published on the website.
Several papers and web-based reports have been produced, and conference and workshop talks and posters given, over the past year, including at major climate and statistics conferences. In early 2012 a keynote plenary talk will be given at a major decadal metrological thermometry conference along with a session of talks. These have helped to raise awareness of the Initiative and its aims amongst relevant communities. A paper outlining the overall Initiative aims was published in the Bulletin of the American Meteorological Society in November 2011 (Thorne, P. W., K. M. Willett et al. (2011), “Guiding the Creation of a Comprehensive Surface Temperature Resource for 21st Century Climate Science”, Bulletin of the American Meteorological Society, doi: 10.1175/2011BAMS3124.1). Similar papers are under revision for Environmetrics (statistics) and in preparation for Metrologia (metrology).
Thanks to significant ongoing efforts by the databank working group, a large number of new data sources have been acquired for inclusion in an initial databank version release. The databank working group continues to be engaged in the development of the first version release, which is being undertaken under the leadership of NOAA’s National Climatic Data Center. The initial monthly databank version release is likely to consist of records from in excess of 30,000 stations globally, with many consisting of maximum and minimum temperatures in addition to monthly averages. Many of the additional sources greatly improve coverage in the period prior to the 1950s. The inclusion of monthly averages derived from existing daily data holdings will significantly enhance post-1950 coverage above and beyond existing monthly databases. The working group progress report can be found in Annex 1.
The benchmarking and assessment working group has made significant progress towards designing a methodological framework for its activities. By necessity the majority of its work is pending the release of the first version of the databank holdings, which the benchmark analogs will ‘mirror’. The working group progress report can be found in Annex 2.
Efforts have been made to coordinate with other relevant activities known to the Initiative participants. The European metrological community MeteoMet initiative (www.meteomet.org) has been engaged and letters of intent to collaborate have been exchanged (Annex 4). The UK Natural Environment Research Council EarthTemp initiative (http://www.earthtemp.net/) has also been engaged and members of the respective steering committees sit on each other’s committees to ensure collaboration. Members of the steering committee also attended and gave talks at the marine community MARCDAT III meeting to encourage continued collaboration with the marine community surface temperature efforts.
Efforts directly or indirectly associated with the Initiative’s work have continued in this reporting period. The dataset being created by the Berkeley group, who attended the initiation workshop, has been submitted to peer review. While not yet to our knowledge accepted, it has been widely reported on in the media and the Initiative was mentioned in a number of those stories. The European COST HOME action to develop new homogenization tools and benchmark test them has had its main results paper published in Climate of the Past. Efforts to benchmark the performance on the US record of the algorithm from NOAA’s National Climatic Data Center have also been published. Significant development of alternative algorithms has occurred at NCDC and NIST and possibly elsewhere.
Significant issues
1. At time of writing, the envisaged working group charged with creating a data portal and user support is still not constituted. Although not currently critical this will become so and suggestions as to how to pursue this would be welcomed.
2. Concerns exist over the ability to get multiple independent groups engaged in the dataset creation (homogenization of the data to create climate products) problem. This is not something which we can directly control. Help in promoting the creation of new algorithms is required.
3. Despite numerous efforts to create a crowdsourcing digitization portal with the Citizen Science Alliance, no funding has been secured. It would require on the order of US$0.5 million to create a portal and pull data through to the databank for three years. Suggestions as to potential avenues to pursue would be welcome.
4. Implicit in much of the above is that the Initiative continues to function in a largely volunteer-based capacity with in-kind support from some of the participants’ institutions. A more dedicated funding solution would help place the Initiative as a whole on a firmer basis.
Plans for the coming year
See also the implementation plan (Annex 3) for detailed plans and deliverables as well as the following section.
1. Release first version of databank holdings in the spring, with associated code, documentation, and submission of a peer-reviewed article. A media splash is planned at that time, preferably through the press offices of multiple databank working group and steering committee members, to raise awareness and solicit additional contributions.
2. Release of first version of analogs and associated documentation in fall / winter.
3. Continued awareness raising through talks and posters at relevant events.
4. Efforts to address the significant issues raised above by dedicated members of the steering committee viz.:
a. Data portal working group creation – Kate Willett, Michael de Podesta, Jayashree Revadekar
b. Getting groups engaged in product creation – Richard Chandler, Antonio Possolo, Xiaolan Wang
PROGRESS ON TASKS DETAILED IN THE IMPLEMENTATION PLAN THROUGH 2012
1. Ongoing or periodic activities
Task: Regular Teleconferences
Main Contact: Peter Thorne Due Date: Ongoing Status: Ongoing Milestone: Regular discussions amongst members of the steering committee
Progress: Regular calls have occurred and been minuted on the web. In general agreed actions have been completed satisfactorily.
Issues: None
Task: Formal annual written report on Initiative
Main Contact: Peter Thorne Due Date: Jan Status: Done Milestone: Written by steering committee to sponsors and posted online
Progress: This document Issues: None.
Task: Formal written reports on working group progress
Main Contact: Jay Lawrimore / Kate Willett Due Date: Oct Status: Done Milestone: Written reports from working groups submitted to steering committee
for approval and posted online.
Progress: Done, see annexes to this document Issues: None
Task: Maintenance of website and blog
Main Contact: Peter Thorne Due Date: Ongoing Status: Ongoing Milestone: Materials updated and highlighted on a regular basis
Progress: All relevant materials have been posted and are up to date. Issues: Public engagement with the blog has been minimal to date.
Task: Promotion of Initiative through relevant meetings
Main Contact: Steering committee Due Date: Ongoing Status: Ongoing Milestone: Presentation to the science community through talks and posters.
Progress: The Initiative has been presented at multiple meetings this past year through talks and / or posters (note that this list includes specific tasks in this regard in the implementation plan):
• World Climate Research Program Open Science Conference (talk by Peter Thorne and 3 posters)
• International Statistical Institute meeting (talk by Richard Chandler)
• COST HOME meeting (talk by Kate Willett)
• ACRE meeting (talk by Kate Willett)
• MARCDAT III (talks by Peter Thorne and Kate Willett)
• GCOS steering committee (talk by Kate Willett)
• EGU (poster on benchmarking by Kate Willett)
• EMS/ECAC (talk by Kate Willett)
• 5th International Verification Methods Workshop (poster by Ian Jolliffe)
• ExIStA Kick-off meeting (poster by Kate Willett)
Issues: None
Task: Engendering new dataset efforts
Main Contact: Steering committee Due Date: Ongoing Status: Cause for concern
Milestone: Exploit opportunities to promote awareness of the need for improvements to and diversity of algorithms, for example by organizing conference sessions and journal special issues and by lobbying funding bodies to support research in this area.
Progress: Limited to date. Peter Thorne has a funding bid with one agency pending a decision but the program may be cut.
Issues: Steering committee members have limited experience in this area and the funding environment is far from conducive.
Task: Up to date reference list of work on inhomogeneities in surface temperatures on website
Main Contact: Kate Willett Due Date: Ongoing Status: Ongoing
Milestone: To form a scientific basis for defining error model (analog) spread.
Progress: Outline report drafted and some references collected and held on website
Issues: None
Task: Advocacy of the benchmarks and support for users
Main Contact: Kate Willett Due Date: Ongoing Status: Ongoing Milestone: All working group members should be encouraging use of the benchmarks and providing support where necessary
Progress: Limited as benchmarks have yet to be finalized. Kate Willett attended the final COST HOME meeting to attempt to raise awareness.
Issues: None
Task: Advocacy of the databank, efforts to augment holdings
Main Contact: Jay Lawrimore Due Date: Ongoing Status: Ongoing Milestone: Efforts to augment data holdings
Progress: A total of over 30 distinct data submissions have been garnered to date to provide input to the databank. Many of these in turn combine multiple additional sources.
Issues: Any help that Initiative sponsors can bring to the table would be greatly appreciated.
2. Activities due for completion in the current reporting period
Task: Posting of initial databank data
Main Contact: Jay Lawrimore Due Date: 4/11 Status: Done Milestone: To be posted through GOSIC website.
Progress: Initial version of databank holdings was posted and continues to be augmented as new sources come in.
Issues: None
Task: Report from scoping discussions with zooniverse on crowdsourcing
Main Contact: Peter Thorne Due Date: 6/11 Status: Done Milestone: Report posted online
Progress: Report posted at http://www.surfacetemperatures.org/databank/data-rescue-task-team/Crowdsourcingdigitization.pdf?attredirects=0
digitize the huge volume of primarily early period records on very economical terms. But we have yet to secure the order of US$500K required to spin this up.
Task: Collation of ongoing efforts to digitize records.
Main Contact: Peter Thorne Due Date: 6/11 Status: Done Milestone: Posted to BADC website and maintained by Rob Allan
Progress: Posted and updated at
http://badc.nerc.ac.uk/browse/badc/corral/images/metobs Issues: None
Task: Benchmarking and Assessment Working Group terms of reference adopted.
Main Contact: Kate Willett Due Date: 6/11 Status: Done
Milestone: Adoption and publication of the working group terms of reference.
Progress: Completed
Issues: None
Task: Establish annual WWR updates
Main Contact: Jay Lawrimore Due Date: 6/11 Status: Done
Milestone: Determine pathway forwards following Congress response to key data priorities in May 2011
Progress: This was presented as an intervention during the WMO Congress, which subsequently voted in favor of it. WMO recommended that the 2001-2010 WWR edition be completed first and that the annual updates be addressed after that.
Issues: No further progress will be made until 2001-‐2010 WWR is completed.
Task: Complete terms of reference for databank working group
Main Contact: Jay Lawrimore Due Date: 8/11 Status: Done Milestone: Adoption and publication of the working group terms of reference.
Progress: Completed Issues: None
Task: Position paper on version control and provenance available for public comment
Main Contact: Jay Lawrimore Due Date: 9/11 Status: Pending Milestone: Posted and available for discussion on the blog
Progress: None to date. Because the databank is still under development, it has been decided to fold this into the databank paper.
Issues: Because the databank is presently dynamic, folding this into the main databank paper avoids the position paper becoming immediately dated.
Task: Prepare generic powerpoint and poster materials
Main Contact: Peter Thorne Due Date: 9/11 Status: Done Milestone: Materials posted online
Progress: Page has been set up at http://www.surfacetemperatures.org/generic-materials and a number of resources posted
Issues: None although any feedback on these would be greatly appreciated.
Task: Paper in Environmetrics special issue
community, in a special issue devoted to current problems in climate (invited contribution).
Progress: Paper is accepted pending minor revisions at this time. Issues: None
Task: Metrologia article
Main Contact: Greg Strouse / Michael de Podesta Due Date: 12/11 Status: Pending Milestone: Initiative briefing paper to the metrology community submitted
Progress: Early drafts have been exchanged between authors
Issues: Significant time availability issues have precluded faster progress.
3. Progress against stated aims for the coming year
Task: ITS-‐9 participation
Main Contact: Peter Thorne Due Date: 3/12 Status: Pending
Milestone: Session at this major metrology conference on the International Surface Temperature Initiative (including at least four papers in the
proceedings)
Progress: There are a total of six initiative related talks lined up including a plenary talk by Peter Thorne. Papers are in preparation at time of writing.
Issues: None.
Task: Release of version 1 of the temperature databank (monthly and daily)
Main Contact: Jay Lawrimore Due Date: 4/12 Status: In progress Milestone: First version of stage 3 (merged) monthly and daily holdings made
available through gosic.org or another designated website
Progress: Significant progress in instigating a merge process has been made and we remain on track for a spring 2012 release at the present time. This release will include full process and data metadata including the code.
Issues: Work remains to be done on any attendant media splash which we would like to accompany the release and publicise the initiative and its work.
Task: Benchmarking position paper submitted for peer review
Main Contact: Kate Willett Due Date: 4/12 Status: Ongoing
Milestone: Submission to Journal of Atmospheric and Oceanic Technology, Journal of Geophysical Research-‐Atmospheres or Climate of the Past
Progress: 1st draft complete but awaiting finalisation of benchmark data creation methods and verification design
Issues: This is likely to be submitted in Autumn (October) as the methods for creating the benchmarks have not been finalised yet.
Task: Paper describing databank first version and underlying principles submitted to journal
Main Contact: Jay Lawrimore Due Date: 8/12 Status: To be started
Milestone: Submission to journal
Progress: None to date. Pending finalization of processing decisions.
Issues: May slip if the databank release is delayed.
worlds (monthly data)
Main Contact: Kate Willett Due Date: 11/12 Status: To be started
Milestone: Release to public of analogs alongside the databank holdings in identical format.
Progress: Pending completion of databank release
Issues: Dependency upon timing of databank initial release
Task: Creation and release of official cycle 1 benchmark analog-known-worlds and analog-error-worlds (monthly data)
Main Contact: Kate Willett Due Date: 11/12 Status: To be started
Milestone: Release to public of analogs alongside the databank holdings in identical format.
Progress: Pending completion of databank release
Annex 1 Databank Working Group report
Databank Working Group
2011 Progress Report
October 2011
Current Members:
Jay Lawrimore (Chair) -‐ NOAA NCDC, USA
John Christy - University of Alabama, Huntsville, USA
Waldenio Gambi de Almeida - CPTEC/INPE, Brazil
Koji Ishihara - Japan Meteorological Agency
Albert Klein-Tank - KNMI, Netherlands
Matthew Menne -‐ NOAA/NCDC, USA
Matilde Rusticucci -‐ Univ of Buenos Aires, Argentina
Vyacheslav Razuvaev -‐ Russian Research Institute of Hydrometeorological Information
Madeleine Renom - IFFC, Univ of the Republic, Montevideo, Uruguay
Jeremy Tandy - UK Met Office, Exeter, UK
Peter Thorne (ex-‐officio) -‐ CICS-‐NCDC, USA
Steve Worley -‐ National Center for Atmospheric Research, USA
Ex-‐Members:
Rod Hutchinson - Australian Bureau of Meteorology
Bryan Lawrence - BADC, UK
New Members:
Meaghan Flannery - Australia Bureau of Meteorology
David Lister - Climatic Research Unit, East Anglia, UK
Albert Mhanda - ACMAD, Niger
Jared Rennie -‐ NOAA NCDC, USA
October 2010 to October 2011 Objectives:
1) Invite members and establish working group communication including email lists, website and wiki.
2) Construct Terms of Reference for Databank WG.
3) Establish structure for a global temperature Databank and methods for data provenance.
4) Begin to populate Databank with Monthly and Daily timescale data.
5) Establish mechanism to support the collection of data; include data already digitized and non-‐digital through data rescue activities.
Objectives Met:
1) Invite members and establish working group communication including email lists, website and wiki.
Scientists from every WMO region were invited to join the Databank working group. Members participate on a completely voluntary basis and are responsible for providing leadership in identifying new sources of data and in providing expert guidance toward establishing and operating the Databank. Two individuals from Australia and the UK were replaced with other data experts from those countries, and there were additions later in the year from Africa and the U.S. The email list [email protected] was established to facilitate communications with all members between teleconferences, which are held every 3 to 4 months.
A Databank Working Group page (www.surfacetemperatures.org/databank-working-group) is hosted on the ISTI website. This webpage provides background information on the purpose of the Databank and makes publicly available the minutes of all teleconferences and documents that pertain to the development and operations of the Databank and its task teams.
The Databank WG launched a wiki to further facilitate communication between members and to serve as a reference for details on discussions and progress toward establishing the
Databank. This wiki is open to all members as a means for tracking progress. Postings to the wiki can be made by any member using a unique login and password.
http://editthis.info/intl_surface_temp_initiative/Main_Page .
2) Construct Terms of Reference for Databank working group
Databank working group Terms of Reference were written and agreed upon by all members. The TOR are hosted on the DWG website. The working group reported to the Steering
Committee, giving a verbal progress report at each quarterly phone call and a written annual progress report.
3) Establish structure for a global temperature Databank and methods for data provenance.
The DWG agreed to focus efforts on Daily and Monthly timescale temperature data in keeping with an overall 6-‐Stage structure. Other elements and timescales will be collected if made available but will not be the focus of this effort for the foreseeable future.
STAGE 0: Digital image and hard copy
STAGE 1: Keyed in native format
STAGE 2: Converted into common format
STAGE 3: Consolidated master database
STAGE 4: Quality controlled derived products
STAGE 5: Homogenized products
submission guidance and will be responsible for integrating these data back into the Databank as value added products.
The Databank is accessible from the Global Observing System Information Center (GOSIC) website (http://www.gosic.org/GLOBAL_SURFACE_DATABANK/GBD.html ), and it is directly accessible at World Data Center-A (ftp://ftp.ncdc.noaa.gov/pub/data/globaldatabank/ ) or its mirror site (ftp://ftp.meteo.ru/pub/data/globaldatabank/ ), which was established at World Data Center-B in Obninsk, Russia.
The DWG made it a high priority to establish data provenance for all data to the greatest extent possible. While the working group recognized that for some data there is little or no information on its origin or history, a foundation was established for providing traceability of data through all stages of the databank. A Data provenance and version control task team was established and asked by the full WG to develop methods that would provide provenance and ensure version control of the Databank. The task team established Data Provenance Tracking Flags as the primary mechanism for documenting provenance and ensuring traceability in a manner consistent with the procedures established for ICOADS, although applied in a manner that meets the unique nature of land surface observations.
Five (5) Data Provenance Tracking (DPT) flags are assigned to each observation within the Stage 2 data files. These flags provide information on the origins and types of Stage 0 and Stage 1 data. The 5 flags are: (1) Stage 0 Source, (2) Stage 1 Source, (3) Data Type, (4) Mode of Digitization, and (5) Mode of Transmission/Collection. Additional flags can be added as the need arises, and the information contained within each DPT flag can be expanded as
necessary to completely define a new type of observation.
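As a rough illustration of how five provenance flags might travel with each Stage 2 observation, the Python sketch below attaches them to a record and reports any that are missing. Only the five flag names come from the description above; the class, the key spellings, the station identifier and every code value are hypothetical and do not reflect the Databank's actual file format or code tables.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# The five flags named in the report. The key spellings are illustrative only.
DPT_FLAG_NAMES = (
    "stage0_source",
    "stage1_source",
    "data_type",
    "digitization_mode",
    "transmission_mode",
)


@dataclass
class Stage2Observation:
    """One observation with its provenance flags (hypothetical layout)."""
    station_id: str
    date: str       # ISO date string
    element: str    # e.g. "TMAX"
    value: float
    # A dict leaves room for additional flags to be added as the need arises.
    dpt_flags: Dict[str, str] = field(default_factory=dict)

    def missing_flags(self) -> List[str]:
        """Return the names of the five core flags that are not yet set."""
        return [name for name in DPT_FLAG_NAMES if name not in self.dpt_flags]


# Made-up example record; the single-letter codes are placeholders.
obs = Stage2Observation(
    station_id="XX000001",
    date="1931-07-14",
    element="TMAX",
    value=24.5,
    dpt_flags={
        "stage0_source": "A",
        "stage1_source": "B",
        "data_type": "C",
        "digitization_mode": "K",
    },
)
print(obs.missing_flags())
```

Holding the flags in a mapping rather than fixed fields is one way to accommodate the stated intent that flags can be added or expanded later.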
This task team also investigated the potential uses for Unique Identifiers (UIDs), which are now being implemented in the ICOADS marine dataset. While the unique nature of ocean observations creates a need for the use of UIDs the team determined that they would not benefit the land surface databank. Additional information is available at
http://www.surfacetemperatures.org/databank/provenance-and-version-control-task-team .
4) Begin to populate Databank with Monthly and Daily timescale data.
Because of contributions from many members of the DWG and others throughout the international community, the Databank was populated with numerous sources of data at both the daily and monthly timescales during the past year. Daily data are available in Stage 1 and Stage 2 formats for the following sources, as shown in Figure 1.
• Australia
• Brazil
• Channel-islands
• ECA/KNMI
• Ecuador
• ISPD (International Surface Pressure Data)
  o IPY
  o Swiss
  o Sydney
• Mexico
• Pitcairnisland
• SACA/KNMI
• Spain
• Uruguay-INIA
• US Forts
• Vietnam
Figure 1. Daily data submitted to the Databank by 21 October 2011, excluding data from the GHCN-‐Daily dataset or its 20 source datasets.
Monthly summary data are available in Stage 1 and Stage 2 formats for the following and as shown in Figure 2.
• Antarctica-SCAR-reader
• Antarctica-South Pole
• Arctic
• Australia
• Canada
• Central Asia
• Colonial Era Archive
• East Africa
• HadCRU version 3
• Histalp
• Japan
• ECA/KNMI
• Russia
• UK Met Office historical
• World Weather Records
Figure 2. Monthly data sources added to the Databank as of 21 October 2011. This includes daily data for which monthly summary data could be calculated.
Each source dataset is held in its own subdirectory and a README file accompanies the data in the Stage 1 directory, providing basic information on the data provider and the data. An INVENTORY file is included in each source subdirectory within the Stage 2 directory
structure. This file provides a list of each station for the particular source and other metadata including station id, name, location, elevation, and first and last year of data.
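To make the INVENTORY contents concrete, here is a minimal Python sketch that reads station entries carrying the metadata listed above (id, name, location, elevation, first and last year). The comma-separated layout, the field order, the class and function names, and the sample station are all assumptions made for illustration; the actual INVENTORY file format is not specified here.

```python
import csv
from dataclasses import dataclass
from io import StringIO
from typing import List


@dataclass
class StationEntry:
    station_id: str
    name: str
    latitude: float
    longitude: float
    elevation_m: float
    first_year: int
    last_year: int


def read_inventory(text: str) -> List[StationEntry]:
    """Parse a hypothetical comma-separated inventory, skipping '#' comments."""
    entries = []
    for row in csv.reader(StringIO(text)):
        if not row or row[0].startswith("#"):
            continue
        sid, name, lat, lon, elev, first, last = (cell.strip() for cell in row)
        entries.append(
            StationEntry(sid, name, float(lat), float(lon),
                         float(elev), int(first), int(last))
        )
    return entries


# Made-up example content, not a real Databank station.
SAMPLE = """# id,name,lat,lon,elev_m,first_year,last_year
XX000001,EXAMPLE STATION,-34.90,-56.20,16.0,1901,2010
"""

stations = read_inventory(SAMPLE)
for s in stations:
    print(s.station_id, s.name, s.last_year - s.first_year + 1, "years spanned")
```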
A limited amount of Stage 3 and Stage 4 data was added to the Databank on the Daily side. This was from the Global Historical Climatology Network-‐Daily dataset, which contains more than 25,000 stations with daily maximum and minimum temperature. This rich source of data is available for conversion to monthly summaries and future use on the Monthly side of the databank.
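The conversion of daily values to monthly summaries mentioned above could be sketched as follows. This is an illustrative Python outline only: the function name, the input layout and the minimum-days completeness threshold are assumptions, not the rule the working group adopted.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def monthly_means(daily, min_days=20):
    """Average daily values by calendar month.

    daily    -- iterable of (date, value) pairs; value may be None if missing
    min_days -- hypothetical completeness threshold; months with fewer
                valid days are omitted from the result
    """
    buckets = defaultdict(list)
    for day, value in daily:
        if value is not None:
            buckets[(day.year, day.month)].append(value)
    return {
        year_month: mean(values)
        for year_month, values in buckets.items()
        if len(values) >= min_days
    }


# Made-up daily maximum temperatures: a full January and one day of February.
daily_tmax = [(date(2011, 1, d), 10.0) for d in range(1, 32)]
daily_tmax.append((date(2011, 2, 1), 5.0))

result = monthly_means(daily_tmax)
print(result)
```

The same function can be applied separately to daily maximum and daily minimum series; February is dropped in this example because it has only one valid day.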
To aid data collection activities the DWG developed a Data Submission Guidance document which provides instructions for data providers. It provides a description of the process for providing raw native digitized (Stage 1) data to the databank and encourages data providers to follow basic guidelines which will ensure data are accurately and efficiently added to the databank with the fewest complications. Information on essential and recommended metadata is also included.
A cover letter was prepared to accompany the data submission guidance document. This letter provides a brief overview of the International Surface Temperature Initiative and the role of the global databank within the wider context of the grand initiative. The cover letter and guidance document are available on the databank website:
http://www.surfacetemperatures.org/databank .
Although those documents focus on data already available in digital format and most readily available, the DWG also established an effort to identify and collect sources of data through data rescue activities led by a Data Rescue task team. This task team includes members from multiple countries and data rescue activities.
The focus of this team is primarily on observations collected prior to the mid-‐twentieth Century and for which there likely exists as much data in non-‐digital form as exists in the current digital archives. The data rescue activities revolve largely around pulling through existing digitization efforts to the databank and attempting to ensure against redundancy of effort. This effort aims to leverage pre-‐existing programs where the credit for much of the data rescue clearly lies. An important area to capitalize on in 2012 will be the growing potential of Crowd Sourcing which has been used to great success by OldWeather.org in keying centuries old marine records. Tens of thousands of surface observation forms were imaged by the Climate Database Modernization Effort in the past 10 years. This provides a rich source of data that can be mined through global volunteer efforts aided by internet-‐based technologies.
Objectives Not Met: None
Other Efforts and Achievements:
The opportunity to communicate the goals and objectives of the Databank working group has been taken at several international meetings. This is now aided by the preparation of a poster on the Databank which is available for inclusion in poster sessions and other venues of
opportunity. Most recently this was included in a 4-‐poster display on the International Surface Temperature Initiative at the World Climate Research Programme Open Science Conference, which was held in Denver, Colorado, USA. This provided an opportunity to introduce many more people to the goals of the effort and led to the donation of several new sources of data.
A certificate of appreciation also is under development. This will convey to National Meteorological and Hydrological Services the gratitude of WMO’s Commission for Climatology, the Global Climate Observing System, and the World Climate Research
Programme for their support of the International Surface Temperature Initiative and broader efforts to meet 21st century needs for climate information.
2011 Annual Overview:
2011 was a year of successes for the global databank. From initial conception, a sound structure of design and implementation was begun. By the end of the year more than 20 sources of data had been collected and added to the databank as Stage 1 and Stage 2 data for both monthly and daily timescales. As the first year came to a close, discussions had begun for methods of data merging, and the coming year promises to be one of more successes as the DWG establishes a process for developing and launching the first version of the merged Stage 3 data. The working group has as its goal the launch of version 1 of the global databank in April 2012. Many things must happen before this occurs, but a continuation of the
contributions from all members of the DWG is sure to make this goal a reality.
Objectives for October 2011 to October 2012:
1) Continue to add sources of Daily and Monthly timescale data to the Databank. Work with DWG members and others in identifying and collecting readily available sources of digital data.
2) Build upon Data Rescue activities and leverage crowd sourcing efforts to begin volunteer digitization of land surface records.
3) Complete position paper on version control and provenance available for public comment.
4) Develop an approved methodology for merging sources of data to create a monthly Stage 3 data product. Include a hierarchy of source data from which to build the merged dataset.
5) Launch version 1 of the Databank for monthly timescale data in April 2012, making all data, processes, and software freely available and accessible.
6) Complete and submit journal article describing version 1 of the Databank and its underlying principles.
Annex 2 Benchmarking and assessment working group
Benchmarking and Assessment Working Group
2011 Progress Report
October 2011
Current Members:
Kate Willett (Chair) - UKMO Hadley Centre, UK
Claude Williams - NCDC, USA
Ian Jolliffe - Exeter Climate Systems, University of Exeter, UK
Robert Lund - Department of Mathematical Sciences, Clemson University, USA
Lisa Alexander - Climate Change Research Centre, University of New South Wales, Australia
Olivier Mestre - Meteo France, France
Stefan Brönnimann - University of Bern, Switzerland
Lucie A. Vincent - Climate Research Division, Environment Canada, Canada
Aiguo Dai - Climate and Global Dynamics Division, NCAR, USA
Steve Easterbrook - Department of Computer Science, University of Toronto, Canada
Victor Venema - Meteorologisches Institut, University of Bonn, Germany
David Berry - National Oceanography Centre, Southampton, UK
Peter Thorne (ex-officio) - CICS-NCDC, USA
Ex-‐Members:
Chris Wikle - Department of Statistics, University of Missouri, USA
Chris had too many other commitments and had to step down.
New Members:
David Berry -‐ National Oceanography Centre, Southampton, UK
October 2010 to October 2011 Objectives:
1) Invite members and set up the group including email lists, website and blogsite.
2) Devise a structure for creation of the Benchmarking cycle and set out a timeline for achievements and submit to the Implementation Plan
3) Construct a Benchmarking and Assessment working group Terms of Reference
4) Publicise the aims and objectives of both the ISTI and the work of the Benchmarking and Assessment working group widely and engage with as many similar efforts as possible
5) Design the concepts behind the benchmarks and begin to construct them
Objectives Met:
1) Invite members and set up the group including email lists, website and blogsite.
Venema, Robert Lund) and can provide expertise. The email list
[email protected] has been set up to communicate with all members. Teleconferences are hosted every 2-3 months.
The ISTI website now has a Benchmarking and Assessment Working Group page
(www.surfacetemperatures.org/benchmarking-and-assessment-working-group) outlining who we are and what we’re aiming to do. It also hosts the minutes of all teleconferences, important documents, listings of conferences attended and a timeline of deliverables.
We have a blog site (http://surftempbenchmarking.blogspot.com) that is open to all members to post threads and anyone to comment on threads. All members are invited to post threads at any time.
2) Devise a structure for creation of the Benchmarking cycle and set out a timeline for achievements and submit to the Implementation Plan
Four teleconferences have been held, each with at least seven members attending. We have discussed: aims and running logistics; creation of the analog-known-worlds; overall concepts and homogenisation questions to address; and a plan of action. The concepts behind benchmarking on such a large scale are complex and have taken a long time to formulate. A first draft of the concepts paper has been written, outlining initial ideas behind analog-known-world creation, error structure implementation to create the analog-error-worlds, and assessment methods. These may change as the appropriateness of these concepts becomes clear.
Three task teams have been established to govern the three components of the benchmarks. Team Creation is led by Robert Lund and will coordinate the design and creation of the homogeneous synthetic analog-known-worlds. Team Corruption is led by Claude Williams and will coordinate the design of error structures and of software to add these to the analog-known-worlds, creating the analog-error-worlds. Team Validation is led by Ian Jolliffe and will coordinate the design and creation of suitable assessment tools. All members have joined one or more teams. Kate has committed to being a very active member of all three teams, recognising that, unlike other members, she has some (10%) of her official work time allocated to this and ISTI-related work. Email lists have been set up for each of these teams:
[email protected]; [email protected]; and [email protected].
A timeline for deliverables has been laid out in the Implementation Plan and is hosted on the BAWG website.
3) Construct a Benchmarking and Assessment working group Terms of Reference
A BAWG Terms of Reference has been written and agreed on by all members. It is now hosted on the BAWG website. The working group will report to the Steering Committee, giving a verbal progress report at every quarterly phone call and a written annual progress report.
4) Publicise the aims and objectives of both the ISTI and the work of the Benchmarking and Assessment working group widely and engage with as many similar efforts as possible
The work of the Benchmarking and Assessment working group has been publicised at a number of conferences; all presentations are hosted on the website:
February 2011
Center (NC, USA): Devising a Benchmarking System for Homogenisation Methods of Climate Data-Products
April 2011
Kate Willett's Poster for EGU 2011: Robust Benchmarking of Homogenisation Algorithms for the Surface Temperature Initiative
May 2011
Kate Willett's presentation for MARCDATIII, Frascati, Italy, 2011: Is it Good Enough? Benchmarking Homogenisation Algorithms and Cross-cutting with Efforts for Land Observations
October 2011
Steve Easterbrook's poster presentation at the WCRP Open Science Conference, Denver, CO, USA: Benchmarking and Assessment of Homogenisation Algorithms for the International Surface Temperature Initiative (ISTI)
Kate Willett’s presentation at the COST HOME 7th Seminar for Homogenisation and Quality Control of Climate Databases, Budapest, Hungary: Creating a Global Benchmark Cycle for the International Surface Temperature Initiative
Further mentions have been made within ISTI general presentations.
5) Design the concepts behind the benchmarks and begin to construct them
A first draft of the concepts paper has been written, outlining initial ideas behind analog-known-world creation, error structure implementation to create the analog-error-worlds, and assessment methods. These may change as the appropriateness of these concepts becomes clear. Three task teams have been established to coordinate these efforts as described above. We recognise that all three are strongly linked and dependent on one another but that there is value in expertise being channelled to specific areas.
The basic concepts are in place. We now need to fine-tune our decisions on exactly what to include and how, and to set up software that allows benchmarks to be produced repeatedly with easily adjusted parameters.
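As a rough illustration of how the three components (creation, corruption, validation) could fit together in parameter-driven software, the following is a minimal sketch only. It is not the working group's method: the class `BenchmarkParams`, the functions `make_known_world`, `corrupt` and `rmse`, all default values, and the choice of simple step changes as the error model are assumptions made for this example.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class BenchmarkParams:
    """Hypothetical tunable settings; none of these values come from the report."""
    n_months: int = 240              # length of the synthetic series
    trend_per_month: float = 0.002   # imposed linear trend (deg C per month)
    noise_sd: float = 0.5            # standard deviation of random noise
    n_breaks: int = 3                # number of step changes to insert
    break_sd: float = 0.8            # standard deviation of step sizes
    seed: int = 0                    # makes each benchmark reproducible


def make_known_world(p):
    """Creation: a homogeneous synthetic series (trend + annual cycle + noise)."""
    rng = random.Random(p.seed)
    return [
        p.trend_per_month * t
        + 5.0 * math.sin(2 * math.pi * t / 12)
        + rng.gauss(0.0, p.noise_sd)
        for t in range(p.n_months)
    ]


def corrupt(known, p):
    """Corruption: add step changes at random months (assumed error model).

    Returns the corrupted series and the list of (month, size) breaks applied.
    """
    rng = random.Random(p.seed + 1)
    months = sorted(rng.sample(range(1, len(known)), p.n_breaks))
    breaks = [(m, rng.gauss(0.0, p.break_sd)) for m in months]
    error_world = list(known)
    for month, size in breaks:
        for t in range(month, len(error_world)):
            error_world[t] += size
    return error_world, breaks


def rmse(a, b):
    """Validation: root-mean-square difference between two series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


if __name__ == "__main__":
    params = BenchmarkParams(n_breaks=2, seed=42)
    known = make_known_world(params)
    error_world, breaks = corrupt(known, params)
    print("breaks applied:", breaks)
    print("RMSE of uncorrected series vs truth:", rmse(error_world, known))
```

In a sketch like this, a homogenisation algorithm would be run on `error_world` and its output scored against `known`; changing a field of `BenchmarkParams` yields a new benchmark without touching the code.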
Objectives Not Met: None
Other Efforts and Achievements:
Provisional acceptance of a PhD proposal to work on geospatial statistical methods in building daily benchmarks with Kate Willett (UK Met Office) and Prof. Trevor Bailey and Prof. Ian Jolliffe from Exeter University and with CASE NERC quota funding from the UK Met Office.
An NSF research proposal incorporating benchmarking was submitted through Clemson University, the University of North Carolina and NCDC but was rejected.
Potential for collaboration on a European daily benchmarking .
Standard talk and poster now prepared and presented – this will be updated as appropriate and made available on the website.
2011 Annual Overview:
2011 has seen the beginnings of the Benchmarking and Assessment Working Group. This has involved bringing together expert individuals and setting up a distributed working
much as others due to other commitments but the input of all members is highly valued and very much appreciated. Despite these obstacles, progress has been made in developing the benchmarking framework. It should still be feasible to begin the benchmark cycle in November 2012.
Objectives for October 2011 to October 2012:
1) Benchmark Cycle concepts and plan formalised and submitted to JAOT or similar by April 2012
2) Design methods and create software for producing the analog-known-worlds ready for November release of pilot benchmarks and creation of the official benchmarks for the Benchmark cycle
3) Design methods and create software for producing the analog-error-worlds ready for November release of pilot benchmarks and creation of the official benchmarks for the Benchmark cycle
4) Design methods and create software for assessing the results of tests on the benchmarks ready for the Benchmark cycle
5) Create a platform for guiding users of the benchmarks in how to use them and how the assessment works
7) Publicise the aims and objectives of both the ISTI and the work of the Benchmarking and Assessment working group widely and engage with as many similar efforts as possible
Annex 3: Implementation Plan
Surface Temperatures Initiative Implementation Plan
Owners: Steering committee
Authors: Peter Thorne, Jay Lawrimore, Kate Willett
Version: 1.0 7/14/11
Executive Summary
The Surface Temperatures Initiative exists as an end-to-end process to facilitate creation of the best possible surface air temperature records over land to meet the myriad data demands of science and society. The Initiative has strong international participation and representation from multiple relevant fields of expertise. It is supported through volunteer participation with no full time staff. To ensure focus the Steering Committee has enacted an Implementation Plan that sets out goals for the short- to medium-term in pursuit of Initiative aims. The Implementation Plan is structured around thematic areas and relies, in addition to the work of Steering Committee members, upon the actions of working groups and task teams. Every action has a specific owner and is time-bound.
The Initiative is currently in its first cycle (due for completion in 2014/5). Specific priorities for this cycle are as follows:
• To develop an initial version of the databank, coordinated by the Databank working group. This will build on experience gained from the ICOADS repository for in situ marine meteorological records. Initially the land surface databank will focus on surface temperature at monthly and daily timescales, with subsequent versions extending to all available variables at monthly, daily, and sub-daily resolutions. Wherever possible the databank will be traceable to the raw data record via supplementary metadata. The databank will be open access and version controlled.
• Efforts will be made to exploit innovative techniques for the digitization of images and hard copy archives, for example using citizen science crowdsourcing (e.g. oldweather.org). These efforts will interface closely with existing projects such as IEDRO (International Environmental Data Rescue Organization) and ACRE (Atmospheric Circulation Reconstructions over the Earth).
• The Benchmarking and Assessment working group will define an initial collection of benchmark datasets, representing analogs of real observations corrupted by various noise models. Data-product creators will be encouraged to run their algorithms on the benchmarks. Such practices will enable users to cross-evaluate data-products and provide a tool both for quantifying the structural uncertainty of homogenization algorithms and for developing them further.
• The Steering Committee will appoint a working group to oversee the development of a functional suite of tools for data visualization and product inter-comparison. This working group will be established after progress has been made on the initial databank setup along with the provision of some benchmark datasets.
• A formal reporting system will be put in place. The Steering Committee will report annually to overseeing bodies in meteorology (the World Meteorological Organization Commission for Climatology), statistics (The International Environmetrics Society) and metrology (International Bureau of Weights and Measures), with working groups reporting to the Steering Committee in advance. All Committee and working group meetings will be documented and posted online.
• The Steering Committee will promote the work of the Initiative to both expert and non-expert
1. Surface Temperature Initiative Background
The Surface Temperature Initiative concept, endorsed by the WMO Commission for Climatology at its 15th session, was launched at a meeting at the UK Met Office, Exeter in September 2010. To meet the requirements placed on climate science in the 21st Century, it is necessary to create a suite of high quality and high-resolution data-products, with openness, transparency, verification, and user tools. Such a range of estimates, and common framework, would aid decision-making at national and international scales and inform adaptation strategies. Crucially, this Initiative is envisaged to be international and interdisciplinary, involving climate scientists, statisticians, metrologists and software engineers from around the world. The Initiative should encompass: data rescue and digitisation; an open, transparent and comprehensive databank with versioning and provenance tracking; a data portal for multiple products estimating local, regional and global scale changes; a common benchmarking and assessment framework; and platforms for data download, intercomparison and visualization. At the 2011 WMO congress the initiative was formally recognized.
2. Implementation Plan scope
This implementation plan (IP) has been written by the Steering Committee and will be updated biannually. It presents a medium-term vision of the implementation of this initiative covering the first full cycle of the databank and benchmarking exercise. It provides intermediate deliverables and activities to be undertaken by the Steering Committee, or by working groups answering to the Steering Committee and any sub-groups thereof. Currently all efforts towards the Initiative are essentially undertaken on a volunteer basis with no funding from any source directed towards explicitly supporting the project.
The IP focuses first and foremost on activities leading to completion of the first assessment cycle, envisaged to occur in 2014/15. It builds upon the principles agreed at the initiation meeting, held at the UK Met Office in September 2010 (details at www.surfacetemperatures.org), and summarized in a BAMS Meeting Summary which will be linked to once available in Advance Online Publication. The dates and aims listed herein will serve as a roadmap and checkpoints with which to guide and gauge progress.
3. Databank preparation
Databank activities are undertaken under the auspices of the Databank working group. The databank is envisaged to build upon the pioneering efforts initiated by a variety of organizations to construct a global repository for in situ land surface observations. It will leverage the experience gained from the ICOADS effort, which for more than 20 years has focused on the creation, maintenance, and development of a globally recognized single repository for in situ marine meteorological records, not limited to temperatures. Although the land surface databank will initially focus on surface temperature on the monthly and daily timescales, long-term goals are much broader. It is envisaged that eventually a successful land databank will consist of holdings of other essential variables at monthly, daily, and sub-daily resolutions. Wherever possible the
Figure 1. Proposed structure of the comprehensive land surface databank and products derived therefrom.
3.1 Databank hosting and structure
Stage Two records. Stages 4 and 5 describe climate data products derived from the databank by individuals and institutions and are not within the scope of the Databank working group.
The four stages of the Databank will be provided initially from a central repository at the Global Observing Systems Information Center (GOSIC). Stage Zero and One data, because they are provided by a variety of host organizations, will exist in a variety of formats. It is envisioned that Stage Two and Three data will be provided in ASCII and possibly NetCDF format along with software which will support accessibility to all users.
3.2 Version control and provenance
Recognizing the importance of retaining strong provenance tracking and version control, a dedicated task team has been established to develop recommendations and guidance to the Databank working group on procedures that give users assurance of the reliability of the databank. Through data provenance, users will be provided with information that documents the history of all observations in the databank to the greatest extent possible. The path of each observation will be traced from the point of origin, through conversion and reformatting at each Stage. Although this will be the goal, it is recognized that many high quality observations already exist in global datasets for which limited provenance information exists. While it will be difficult to reconstruct the history of such observations, established procedures should ensure that the paths of all observations are documented and accessible to the greatest extent possible.
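One way to picture the provenance path described above is a record that accumulates one entry per Stage. The sketch below is illustrative only: the task team's recommendations are not given here, and the class names `ProvenanceStep` and `Observation`, the field names, the station identifier, and the action and source strings are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ProvenanceStep:
    stage: int    # databank Stage number (0 to 3)
    action: str   # free-text description of what was done (hypothetical wording)
    source: str   # hypothetical identifier of the file or archive involved


@dataclass
class Observation:
    station_id: str
    date: str     # month in ISO form, e.g. "1950-01"
    value: float
    history: list = field(default_factory=list)

    def record(self, stage, action, source):
        """Append one step to the path; Stage numbers may not go backwards."""
        if self.history and stage < self.history[-1].stage:
            raise ValueError("stage cannot decrease along the path")
        self.history.append(ProvenanceStep(stage, action, source))

    def path(self):
        """Human-readable path, or a notice when no provenance is available."""
        if not self.history:
            return "no provenance recorded"
        return " -> ".join(f"Stage {s.stage}: {s.action}" for s in self.history)


if __name__ == "__main__":
    obs = Observation("STN0001", "1950-01", 3.4)
    obs.record(0, "received", "archive-scan-17")
    obs.record(1, "converted", "host-format-file")
    obs.record(2, "reformatted", "common-format-file")
    print(obs.path())
    # Stage 0: received -> Stage 1: converted -> Stage 2: reformatted
```

An observation inherited from an existing global dataset with limited provenance would simply carry an empty or partial `history`, which matches the caveat in the text that not every path can be reconstructed.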
Version control is also recognized as essential to ensuring the traceability of changes made to the databank. In addition, version control relates not only to the data within the databank but also to the software used to create and process the data. All of these should be documented through a version control numbering system sophisticated enough to encompass routine updates to the databank, upgrades that are expected to occur throughout the life of a global databank, as well as unforeseen developments that may occur.
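The numbering system itself is not specified in this plan. As a purely hypothetical sketch, a three-part number could separate the kinds of change the paragraph lists; the mapping of `major`, `minor` and `update` to unforeseen developments, planned upgrades and routine updates is an assumption made for this example, as are the names `DatabankVersion` and `release_label`.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class DatabankVersion:
    major: int    # assumed: unforeseen or structural developments
    minor: int    # assumed: planned upgrades
    update: int   # assumed: routine data updates

    def bump(self, kind):
        """Return the next version for a given kind of change."""
        if kind == "major":
            return DatabankVersion(self.major + 1, 0, 0)
        if kind == "minor":
            return DatabankVersion(self.major, self.minor + 1, 0)
        if kind == "update":
            return DatabankVersion(self.major, self.minor, self.update + 1)
        raise ValueError(f"unknown change kind: {kind}")

    def __str__(self):
        return f"v{self.major}.{self.minor}.{self.update}"


def release_label(data, software):
    """Pair the data and software versions so a release pins both."""
    return f"data {data} / software {software}"


if __name__ == "__main__":
    data = DatabankVersion(1, 0, 0).bump("update")
    software = DatabankVersion(1, 0, 0).bump("minor")
    print(release_label(data, software))  # data v1.0.1 / software v1.1.0
```

Versioning data and software separately, then recording the pair, reflects the point that version control covers the processing code as well as the data.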
3.3 Recovery and conversion of non-digital data
Recent estimates suggest that about as much data remains to be digitized as has already been digitized (Stott and Thorne, 2010). Much of this data has been imaged but never digitized. Millions of images exist and even more hard copy archives have yet to be fully cataloged and exploited. This inevitably constitutes a multi-year effort. Traditionally digitization has been done professionally at significant cost. Some initial efforts are being made to broaden the range of approaches, including the use of citizen science crowdsourcing (e.g. oldweather.org, data-rescue-at-home.org). These and other mechanisms will need to be pursued to get the data digitized on a reasonable timescale. This effort will need to interface closely with existing projects such as IEDRO and ACRE to guard against duplication.
3.4 Real time data exchange
The real-time and near real-time exchange of weather and climate data and information is made possible by the collective contributions of WMO Member nations through the Global Telecommunication System (GTS). Although it has largely met the needs of the climate community for high quality monthly resolution data, as the need for daily and sub-daily data has developed over the past 20 years, policy, procedures, and technology have not kept pace. Although WMO