Initiative overview
08/19/11
Talk Outline
•
Background
•
Data rescue and databank
•
Multiplicity of data products
•
Common assessment through consistent
performance benchmarking
•
Data and product portal development
Background
•
Several independent global estimates of Land
Surface Air Temperatures
– Independently produced
– Gross agreement on all timescales
– Process metadata in some cases missing
– Much of the work was undertaken in late 1980’s/ early
1990’s - technology and expectations have changed since then
Examples of temperature reconstructions
IPCC Fourth Assessment Report (2007), Figure 3.1
Annual anomalies of global land-surface air temperature (°C), 1850 to 2005, relative to the 1961 to 1990 mean for CRUTEM3 updated from Brohan et al. (2006). The smooth curves show decadal variations … The black curve from CRUTEM3 is compared with those from NCDC (Smith and
Reynolds, 2005; blue), GISS (Hansen et al., 2001; red) and Lugina et al. (2005; green). See
ISI Congress, Dublin, August 26th 2011 5
Why the need to revisit?
• Global surface temperature records are key line of evidence • Backed by many other indicators (ice loss, humidity increases,
sea level increases etc.)
BUT:
• Link from station data to global averages is indirect (statistical
problem – reconstruction depends on methodology used)
• Data holdings are dispersed with poor provenance • Few user tools: access difficult for non-specialists
• Concerns over: evolving station networks; changes in
measurement technique / station location / recording practice (often undocumented); urban “heat island” effects; etc. etc.
• Increasing role of “blogosphere” & critiques outside
Inhomogeneities: annual mean min temp at
Reno, Nevada, USA
Issues not well understood …
•
Not all commentators / policymakers understand the
issues:
The big question
•
Can we create a process that leads to a suite of well
characterized estimates of land surface temperatures
that can be used to answer scientific and societal
demands of the 21
stCentury?
– Open and transparent
– Better understanding of fundamental instrument performance
and measurement properties
– Consistent performance evaluation – User tools
– Not just monthly at the largest scales. Daily, sub-daily, regional
How the initiative was started
• 2010 UK Met Office Submission to World Meteorological
Organization Commission for Climatology
– Call for creating new suite of products to meet 21st Century
demands / expectations
• September 2010 instigation workshop, UK Met Office, Exeter UK
– 80 international experts including climate scientists, metrologists, statisticians, software engineers
– White papers posted online and public comments solicited – Agreed project outline and governance structure
Databank Progress
•
Working Group instigated
–
Data rescue task team
–
Provenance and version control task team
•
Development version posted
–
http://www.gosic.org/
GLOBAL_SURFACE_DATABANK/GBD.html
•
First version release and accompanying
Stage 0 data in hardcopy / image format
•
Millions of rescued images in the NOAA Foreign
Data Library
•
2000+ boxes of data in the NCDC library
•
Holdings in other libraries and repositories,
particularly former colonial powers
•
Holdings literally rotting away or seen as a
nuisance in many countries
Data currently digitized
•
Numerous public holdings
–
NCDC
–
UCAR
–
RIHMI
–
Regional and national holdings
•
Some restricted for commercial or geopolitical
Stage 2 – common format
Stage 3
•
Same format as stage 2
•
One unique version for each station – recommended
version for most users
•
Forms basis for creation of analogs (see later)
•
Protocols used in merging sources are as yet to be
finalized
•
Provenance tracking will ensure an unbroken chain to
earlier stages
Partnerships essential
• Bring together existing efforts, augment and ensure pull through. e.g. ACRE
(“Atmospheric Circulation Reconstructions over the Earth”) project ( http://www.met-acre.org/), International Environmental Data Rescue Organisation (www.iedro.org) and other national / international programs.
• Pursue innovative approaches (crowdsourcing building upon success of oldweather.org etc.)
• Build on ICOADS model for sea surface temperatures (http://icoads.noaa.gov/) – easy
submission and access to data to recognized global repository
• Recognize key partners and contributions • And you can also help …
– www.surfacetemperatures.org/databank
– http://www.surfacetemperatures.org/databank/DataSubmission-Stage1-Guidance.pdf?attredirects=0
Multiplicity of data products
• Structural uncertainty is the key
– Raw data is far from traceable to international measurement standards
– Data artefacts are numerous and have myriad causes – Metadata is patchy at best
– Data is discrete in both space and time
– No “how to” … rather very many cases of “it may work …”
– Multiple subjective decisions required even in automated procedures (thresholds, periods, test type etc.)
• Need multiple independent efforts with different choices
– Station selection
– Time and space resolution – Quality control choices – Homogenization decisions – Averaging procedures
• Should not just be climate scientists as need broad range of
approaches
– Statisticians, metrologists, software engineers, citizen scientists etc.
• Distinct approaches pinpoint key uncertainties so redundancy is
of fundamental scientific value …
• … at least providing the underlying algorithm is understood
Benchmarking and assessment
•
With real world data we do not have the luxury of
knowing the truth – we cannot measure
performance of a specific method or closeness to
real world truth of any one data-product.
•
Instead focus on performance of underlying
algorithms
•
Consistent synthetic test cases, simulating real world
Benchmarking progress
• Working group and task teams instigated • Create c.10 analog-error-worlds
– Climate model basis (maintains plausible far field correlation
structure) tweaked with real station climate characteristics
– Add in random and systematic errors to approximate the real
world error structures which may exist
– Algorithm assessment based upon ability to recover original
data
– Error structures should enable answering a range of questions /
assumptions regarding the true error to avoid over-tuning
• Analogs to be made available Nov. 2012 based upon version 1
Benchmarking example
• For USHCN (lower 48 states) • Step changes and white noise
– 4 identical error structure on 2 20th Century runs, control and emission
halt runs from different models
– No errors (do no harm)
– Predominantly frequent small breaks (hard)
– Perfect metadata, large breaks (easy)
• 100 member perturbed ensemble of the NCDC pairwise algorithm • Consideration solely of timeseries and trends
• Analyses that follow are for predominantly frequent small breaks
Open issues on data product provision
•
Data formats
•
Degree of user interaction
•
Ability to create graphical and tabular output
on the fly
•
Limited progress to date
–
Largely a reflection that this data provision is some
way down the road?
Progress to date summary
•
Steering committee set up
– Terms of reference
– World Meteorological Office and The International
Environmetrics Society sponsored (seeking International Bureau of weights and measures)
•
Working groups on databank and benchmarking active
– Databank prototype made public and data sources coming in.
•
Implementation Plan published
•
Progress documented on initiative website at
You can help make this a success
•
Help to find raw data sources
•
Come up with novel ways of analyzing the
data
Background materials
•
Links to other relevant organizations (finite)
–
http://www.met-acre.org
/ - ACRE – data rescue,
reanalyses, end-user engagement
–
http://iedro.org
/ - IEDRO – data rescue
–
http://icoads.noaa.gov
/ - ICOADS – marine
databank
–
http://www.data-rescue-at-home.org
/ - data
Ecological Phenological Health & Disease Reinsurance Climate Monitoring Model Validation Environmental Assessments Extremes, Impacts & Risks Water resources Agricultural Forestry Energy Marine operations Fisheries Cultural landscapes and built heritage Education
Global historical reanalysis
56 realisations every 6 hours, 2⁰ x 2 ⁰ spatial resolution
O p e n P la tf o rm M e t O ffi c e -T S B -I B M P R E C IS D o w n sc a lin g H ig h e r R e so lu ti o n Citizen Science Crowd Sourcing (eg. Oldweather.org Data rescue@home) Global surface weather