Transcript Data Gaps
CGE Training Materials National Greenhouse Gas Inventories Addressing Data Gaps
Version 2, April 2012
Consultative Group of Experts (CGE)
Training Materials for National Greenhouse Gas Inventories
Target Audience and Objective of the Training Materials
These training materials are suitable for people with beginner to intermediate level knowledge of national greenhouse gas (GHG) inventory development.
After having read this presentation, in combination with the related documentation, the reader should:
• Have an overview of how to address data gaps
• Have a general understanding of the methods and tools available, as well as of the main challenges of GHG inventory development in that particular area
• Be able to determine which methods suit their country’s situation best
• Know where to find more detailed information on the topic discussed.
These training materials have been developed primarily on the basis of methodologies developed by the IPCC; hence the reader is always encouraged to refer to the original documents to obtain further detailed information on a particular issue.
Acronyms
EF: Emission Factor
LULUCF: Land Use, Land-Use Change and Forestry
Problems
What do we do when there are gaps in the data?
We only have data for 1995 and 2000.
We want to switch to a Tier 2 method, but we only have disaggregated livestock data starting last year.
The Energy Ministry stopped collecting data on natural gas flaring. What do we do?
Time Series Consistency
Inventories can help you understand emissions/removals trends.
These trends should be neither overestimated nor underestimated, as far as can be judged.
The time series should be calculated using the same method and same data sources in all years.
In reality: it is not always possible to use exactly the same methods and data for the entire time series.
Dealing with Reality
Data gaps may occur because:
• A new emission factor (EF) or method is applied for which historical data are not available
• New activity data become available, but not for historical years
• There has been a change in how the EF is developed or activity data are collected, or activity data cease to be available
• A new source or sink category is added to the inventory, for which historical data are not available
• Errors are identified in historical data or calculations that cannot be easily corrected.
These problems can be especially challenging for the agriculture and LULUCF sectors.
Emission Factors
Recognize: using a constant EF does not ensure time series consistency.
For some emission processes, emission rates may vary over time due to technological or other changes:
• No: for stoichiometric processes
• Yes: for many biological and technology-specific processes.
Data Availability
Changes and gaps in data:
• More disaggregated data or other improvements in data collection (e.g., better surveys in future years)
• Missing years or data no longer collected.
Periodic data:
• Data collection only every few years or on a regional rolling basis (i.e., each year a different region is surveyed)
• Common for the LULUCF sector (e.g., forest inventory only done every five years).
No data?
Splicing and Gap-filling Approaches
Splicing: combining or joining more than one method or data series to form a complete time series:
• Addresses a change in method (e.g., when a Tier 2 method can only be applied to new data but Tier 1 is still used for historical data)
• Fills gaps due to periodically collected data.
Use surrogate or proxy data to “create” data that are otherwise missing.
Splicing and Gap-filling Approaches
• Overlap
• Surrogate data (i.e., correlated proxy data)
• Interpolation/extrapolation
• Trend extrapolation.
Overlap Approach
Calculate emissions or collect data using both old and new methods/systems for several years:
• Can be used with one year of overlap, but this should be done with great caution.
Investigate the relationship between old and new time series for years of overlap.
Develop a mathematical relationship and use it to recalculate historical data to be consistent with new methods/systems.
Overlap - Consistent Relationship
In this example, it is acceptable to use the overlap adjustment.
Overlap - Inconsistent Relationship
In this example, it is not acceptable to use the overlap approach because there is too much variability in the relationship between the two series.
Overlap Approach
Where there is a consistent relationship, the default is to use a proportional adjustment of old estimates/data to be consistent with new.
Overlap Approach
Example 1: Use the overlap approach to estimate GHG emissions for years 2001–2003, using the data below.
Year   Tier 1 quantified   Tier 2 quantified
2001   4,000
2002   4,000
2003   4,100
2004   4,200               4,035
2005   4,800               4,598
2006   4,900               4,410
2007   5,000               4,500
2008   4,800               4,320
2009   4,900               4,513
2010   5,000               4,790
Overlap Approach
Example 1: Step 1
For each year of overlap, calculate the ratio between Tier 2 and Tier 1. E.g., for year 2010: 4,790 / 5,000 = 0.96
Year   Tier 1 quantified   Tier 2 quantified   Ratio Tier 2 : Tier 1
2001   4,000
2002   4,000
2003   4,100
2004   4,200               4,035               0.96
2005   4,800               4,598               0.96
2006   4,900               4,410               0.90
2007   5,000               4,500               0.90
2008   4,800               4,320               0.90
2009   4,900               4,513               0.92
2010   5,000               4,790               0.96
Overlap Approach
Example 1: Step 2
Calculate the average and standard deviation of the ratios Tier 2 : Tier 1 for 2004–2010 (0.96, 0.96, 0.90, 0.90, 0.90, 0.92, 0.96):
Average = 0.93
Standard deviation = 0.027
Low variability: the overlap approach seems appropriate.
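The check in Steps 1 and 2 can be sketched in Python as follows. This is a minimal illustration, not a prescribed tool: the function name `overlap_ratios` is hypothetical, and the population standard deviation is an assumption, chosen because it reproduces the 0.027 quoted above (the sample standard deviation would be slightly larger).

```python
from statistics import mean, pstdev

# Overlap years of Example 1 (2004-2010), values as given on the slide.
tier1 = {2004: 4200, 2005: 4800, 2006: 4900, 2007: 5000,
         2008: 4800, 2009: 4900, 2010: 5000}
tier2 = {2004: 4035, 2005: 4598, 2006: 4410, 2007: 4500,
         2008: 4320, 2009: 4513, 2010: 4790}


def overlap_ratios(old, new):
    """Return {year: new / old} for every year present in both series."""
    return {year: new[year] / old[year] for year in sorted(old) if year in new}


ratios = overlap_ratios(tier1, tier2)
values = list(ratios.values())
average_ratio = mean(values)
# Population standard deviation (assumption; it matches the slide's figure).
ratio_spread = pstdev(values)

for year, ratio in ratios.items():
    print(year, f"{ratio:.2f}")
print(f"average = {average_ratio:.2f}")              # 0.93
print(f"standard deviation = {ratio_spread:.3f}")    # 0.027
```

What counts as "low" variability remains a judgment call for the inventory compiler; the sketch only reports the numbers.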
Overlap Approach
Example 1: Step 3
Apply the average ratio to the Tier 1 values to calculate the missing Tier 2 data (the results use the unrounded average, about 0.9282, shown as 0.93):
Year 2001: 4,000 * 0.93 = 3,713
Year 2002: 4,000 * 0.93 = 3,713
Year 2003: 4,100 * 0.93 = 3,806
Year   Tier 1 quantified   Tier 2 quantified   Ratio Tier 2 : Tier 1
2001   4,000               3,713
2002   4,000               3,713
2003   4,100               3,806
2004   4,200               4,035               0.96
2005   4,800               4,598               0.96
2006   4,900               4,410               0.90
2007   5,000               4,500               0.90
2008   4,800               4,320               0.90
2009   4,900               4,513               0.92
2010   5,000               4,790               0.96
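Step 3 can be sketched in the same way. This is an illustration under stated assumptions: the function name `backcast_with_overlap` is hypothetical, and the adjustment is the simple proportional one described earlier (old values multiplied by the average Tier 2 : Tier 1 ratio over the overlap years).

```python
from statistics import mean

# Example 1 data as given on the slides.
tier1 = {2001: 4000, 2002: 4000, 2003: 4100, 2004: 4200, 2005: 4800,
         2006: 4900, 2007: 5000, 2008: 4800, 2009: 4900, 2010: 5000}
tier2 = {2004: 4035, 2005: 4598, 2006: 4410, 2007: 4500,
         2008: 4320, 2009: 4513, 2010: 4790}


def backcast_with_overlap(old, new):
    """Estimate new-method values for years that only have old-method values."""
    overlap_years = [year for year in old if year in new]
    if not overlap_years:
        raise ValueError("the two series share no overlap years")
    # Unrounded average ratio over the overlap period.
    average_ratio = mean(new[year] / old[year] for year in overlap_years)
    return {year: old[year] * average_ratio
            for year in sorted(old) if year not in new}


estimates = backcast_with_overlap(tier1, tier2)
for year, value in estimates.items():
    print(year, round(value))   # 2001 3713, 2002 3713, 2003 3806
```

Multiplying by the rounded 0.93 would give 3,720 for 2001; the 3,713 shown on the slide corresponds to the unrounded average.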
Overlap Considerations
Remember, it is crucial to have multiple years of overlap to apply this method properly.
This method should not be applied blindly. You should do your best to understand the relationship between the old and new methods:
• E.g., why does the old method consistently give results that are 10 to 15% less than the new method?
If you cannot explain the difference, then you cannot be sure that the new method is actually better!
Just because a method/model is more complicated does not mean it is more accurate!
Surrogate Data Approach
Find a surrogate (i.e., proxy) variable that is well-correlated with the missing data:
• Can be used for missing activity data, for EFs (that change each year) or for emission estimates
• Example: automobile license payments may be well-correlated with petrol use, so license data may serve as a surrogate for petrol consumption.
This approach builds on techniques used in statistical (e.g., econometric) analysis:
• Regression techniques are valuable to identify potential surrogate parameter(s)
• Correlation analysis.
Surrogate Approach Steps
Identify potential surrogate/proxy variables.
If you have some actual data, calculate simple correlation coefficients:
• You should have more than one year of actual data to establish a relationship with the surrogate parameter.
If the correlation is not obvious, then consider more sophisticated regression techniques to see if a relationship between the actual and surrogate parameters can be found.
If you have no actual data, then you will need to justify why the surrogate parameter is a legitimate proxy for the actual variable(s).
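A simple correlation coefficient of the kind mentioned above could be computed as in the sketch below. The data are hypothetical illustration values (they do not come from these materials), loosely following the license-payment and petrol-use example, and the function name `pearson_r` is likewise an assumption.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long series with at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical values for five years with both surrogate and actual data.
license_payments = [100, 104, 109, 115, 118]      # candidate surrogate
petrol_use = [50.0, 52.5, 54.0, 58.0, 59.0]       # actual activity data

r = pearson_r(license_payments, petrol_use)
print(f"r = {r:.3f}")
```

A coefficient close to 1 supports using the candidate as a surrogate, but it does not replace the need to explain why the two variables should move together.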
Surrogate Approach
This formula assumes a simple proportional relationship between the surrogate and actual variables.
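The formula itself did not survive in this transcript. A simple proportional relationship of the kind described is commonly written y0 = yt * (s0 / st), where y is the emission estimate or activity data, s is the surrogate parameter, t is a year with known data and 0 is the year to be estimated; that form is assumed in the sketch below. The function name is hypothetical, and the figures are borrowed from Example 2 of these materials purely for illustration.

```python
def surrogate_estimate(known_estimate, surrogate_known_year, surrogate_target_year):
    """Scale a known estimate by the change in the surrogate: y0 = yt * (s0 / st)."""
    if surrogate_known_year <= 0:
        raise ValueError("surrogate value for the known year must be positive")
    return known_estimate * (surrogate_target_year / surrogate_known_year)


# Illustration only: treat 2005 as the year with a known estimate and
# 2001 as the year to be filled.
emissions_2005 = 18628   # thousand t CO2
vehicles_2005 = 4224     # thousand vehicles
vehicles_2001 = 3520     # thousand vehicles

emissions_2001 = surrogate_estimate(emissions_2005, vehicles_2005, vehicles_2001)
print(round(emissions_2001))   # 15523
```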
Surrogate Approach
Example 2: Using number of vehicles as a surrogate, estimate CO2 emissions for the variables below.
Target variable: road transportation CO2 emissions, 2001–2010 (no values available).
Known surrogate variable: number of road vehicles in circulation (000s)
Year   Vehicles (000s)
2001   3,520
2002   3,520
2003   3,608
2004   3,696
2005   4,224
2006   4,312
2007   4,400
2008   4,224
2009   4,312
2010   4,400
Vehicle-related data from different studies:
• Transportation study 1 (2009): CO2 emissions for average car = 190 g/km, average km per year = 13,000
• Transportation study 2 (2008): CO2 emissions for average road vehicle = 4,410 kg CO2 per year
• Transportation study 3 (2007): CO2 emissions for average passenger vehicle = 220 g/km, average km per year = 16,000
• Transportation study 4 (2008): freight vehicles are 5% of all road vehicles and emit on average 550 g/km.
Surrogate Approach
Example 2: Step 1
Assess potential surrogate parameters.
Vehicle-related data from different studies:
• Transportation study 1 (2009): CO2 emissions for average car = 190 g/km, average km per year = 13,000
• Transportation study 2 (2008): CO2 emissions for average road vehicle = 4,410 kg CO2 per year
• Transportation study 3 (2007): CO2 emissions for average passenger vehicle = 3,520 kg CO2 per year
• Transportation study 4 (2008): freight vehicles are 5% of all road vehicles and emit on average 550 g/km
The average for all road vehicles is more appropriate when focusing on road transportation emissions as a whole.
Category                 Year   Average km/year   Average emission factor (g CO2/km)   Average emissions per vehicle (kg CO2/vehicle)
All road vehicles        2008                                                          4,410
All passenger vehicles   2007   14,000            200                                  2,800
Cars only                2009   13,000            190                                  2,470
Additional data collection:
• Traffic study 5: average km travelled per year by freight vehicles = 65,000 – 74,000 km
[ 4,410 – 2,800 * (100% – 5%) ] / 5% = 70,000, i.e., if freight vehicles on average travel 70,000 km per year, both data on “all road vehicles” and “all passenger vehicles” are accurate.
Surrogate Approach
Example 2: Step 2
Use the surrogate variable and parameter for the calculation.
Apply the surrogate parameter (4,410 kg CO2/vehicle) and calculate emissions. E.g., 4,224,000 vehicles * 4,410 kg CO2/vehicle / 1,000 ≈ 18,628,000 t CO2
Known surrogate variable: number of road vehicles in circulation (000s)
Target variable: road vehicle CO2 emissions in thousand metric tons
Year   Vehicles (000s)   CO2 emissions (thousand metric tons)
2001   3,520             15,523
2002   3,520             15,523
2003   3,608             15,911
2004   3,696             16,299
2005   4,224             18,628
2006   4,312             19,016
2007   4,400             19,404
2008   4,400             19,404
2009   4,410             19,448
2010   4,450             19,625
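The calculation on this slide can be sketched as below, using a subset of the years shown. The function and variable names are hypothetical; the only inputs are the vehicle counts and the 4,410 kg CO2 per vehicle parameter from the slide.

```python
KG_CO2_PER_VEHICLE = 4410  # surrogate parameter applied on the slide

# Road vehicles in circulation (thousands), a subset of the years on the slide.
vehicles_thousands = {2001: 3520, 2002: 3520, 2004: 3696, 2005: 4224,
                      2007: 4400, 2008: 4400, 2009: 4410}


def road_co2_kt(vehicles_k, kg_per_vehicle=KG_CO2_PER_VEHICLE):
    """Thousand vehicles x kg CO2 per vehicle gives t CO2; / 1000 gives thousand t."""
    return vehicles_k * kg_per_vehicle / 1000


emissions_kt = {year: round(road_co2_kt(count))
                for year, count in vehicles_thousands.items()}

for year, value in emissions_kt.items():
    print(year, value)
```

For 2005 this gives 18,628 thousand metric tons, matching the worked figure on the slide.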
Interpolation and Extrapolation
Interpolation: filling gaps in an existing time series.
Extrapolation: filling gaps at the end or beginning of a time series.
Techniques:
• Linear or nonlinear; justify the choice
• Should not be used for variables that have large variability from year to year.
Interpolation Example
Extrapolation Example
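The figure for this slide is not part of the transcript. As a stand-in, the sketch below shows one possible form of linear trend extrapolation: a least-squares straight line fitted to the known years and evaluated one year beyond them. The function name is hypothetical, the known values are borrowed from the 2007–2010 data of Example 3 in these materials, and the 2011 target year is an assumption made only for illustration.

```python
def linear_trend_extrapolate(series, target_year):
    """Fit a least-squares straight line to {year: value} and evaluate it."""
    years = sorted(series)
    if len(years) < 2:
        raise ValueError("need at least two known years to fit a trend")
    mean_year = sum(years) / len(years)
    mean_value = sum(series[y] for y in years) / len(years)
    sxx = sum((y - mean_year) ** 2 for y in years)
    sxy = sum((y - mean_year) * (series[y] - mean_value) for y in years)
    slope = sxy / sxx  # average change per year along the fitted line
    return mean_value + slope * (target_year - mean_year)


known = {2007: 4655, 2008: 4770, 2009: 4880, 2010: 4975}
estimate_2011 = linear_trend_extrapolate(known, 2011)
print(round(estimate_2011, 1))   # 5087.5
```

The sketch applies no limit on how far beyond the data it will extrapolate; the compiler should restrict its use to a few years and to series with a steady trend.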
Interpolation Example
Example 3: Using the interpolation technique, estimate the GHG emissions for years 2004–2006.
Year   GHG emission source x
1999   3,800
2000   3,920
2001   4,030
2002   4,135
2003   4,235
2004
2005
2006
2007   4,655
2008   4,770
2009   4,880
2010   4,975
Interpolation Example
Example 3: Step 1
Analyze the data and assess the applicability and type of interpolation technique desired.
Linear interpolation seems appropriate for this data set.
Interpolation Example
Example 3: Step 2
Calculate the difference in GHG emissions between the last year before the gap and the first year after the gap: 4,655 – 4,235 = 420
Interpolation Example
Example 3: Step 3
Calculate the length of the gap: 2007 – 2003 = 4 years
Interpolation Example
Example 3: Step 3
Calculate the average change in emissions per gap year: 420 / 4 = 105
Interpolation Example
Example 3: Step 3
Calculate the difference in GHG emissions between the last year before the gap and the first year after the gap: 4,655 – 4,235 = 420
Calculate the length of the gap: 2007 – 2003 = 4 years
Calculate the average change in emissions per gap year: 420 / 4 = 105
Calculate total emissions for the gap year(s) by adding the average change per year:
2004 emissions = 4,235 + 105 = 4,340
2005 emissions = 4,340 + 105 = 4,445
2006 emissions = 4,445 + 105 = 4,550
Year   GHG emission source x
1999   3,800
2000   3,920
2001   4,030
2002   4,135
2003   4,235
2004   4,340
2005   4,445
2006   4,550
2007   4,655
2008   4,770
2009   4,880
2010   4,975
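The steps of Example 3 can be sketched as a small function. The function name `linear_interpolate` is hypothetical; the logic is the one walked through above (difference across the gap, divided by the gap length, added year by year).

```python
def linear_interpolate(series):
    """Fill missing years between known points of {year: value} linearly."""
    known_years = sorted(series)
    filled = dict(series)
    for start, end in zip(known_years, known_years[1:]):
        gap_length = end - start
        if gap_length > 1:
            # Average change per year across the gap.
            step = (series[end] - series[start]) / gap_length
            for offset in range(1, gap_length):
                filled[start + offset] = series[start] + step * offset
    return dict(sorted(filled.items()))


# Known values from Example 3; 2004-2006 are missing.
source_x = {1999: 3800, 2000: 3920, 2001: 4030, 2002: 4135, 2003: 4235,
            2007: 4655, 2008: 4770, 2009: 4880, 2010: 4975}

filled = linear_interpolate(source_x)
for year in (2004, 2005, 2006):
    print(year, round(filled[year]))   # 4340, 4445, 4550
```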
Splicing and Gap-filling Summary
Approach: Overlap
  Applicability: Data necessary to apply both old and new method must be available for at least one year, preferably more.
  Comments: Only use when overlap shows a pattern that appears reliable.
Approach: Surrogate data
  Applicability: Missing data are strongly correlated with proxy data.
  Comments: Should test multiple potential proxy data variables.
Approach: Interpolation
  Applicability: For periodic data or a gap in the time series.
  Comments: Linear or non-linear interpolation. Only use where data show a steady trend.
Approach: Trend extrapolation
  Applicability: Beginning or end of the time series is missing data.
  Comments: Only use where the trend is steady and likely to be reliable. Should only be used for a very few years.
Summary Comments
Preferred approaches are overlap and surrogate, because:
• They are based on actual data
• Interpolation and extrapolation are effectively projections that assume certain trends in the absence of data.
As in research, it is not good practice to simply apply a gap-filling method blindly:
• You should understand why your approach is justified and be able to explain it transparently.
Ask yourself: will what I am doing stand up to peer review in a technical journal?
Thank you