Transcript Document

Statistical Tools for Solar Resource Forecasting
Vivek Vijay
IIT Jodhpur
Date: 16/12/2013
Outline
• Solar Resource Assessment
• Types of Data
• Regression Analysis – Modeling of Cross Sectional Data
• Statistical Tests
• Dimensionality Reduction
• Time Series Forecasting
• Learning Algorithm - ANN
Solar Resource Assessment
Solar Resource Assessment (SRA) is a characterization of solar
irradiance available for energy conversion for a region or specific
location over a historical time period of interest.
Forecasting solar irradiance is an important first step toward predicting
the performance of a solar-energy conversion system and ensuring
stable operation of the electricity grid.
PV plants are fairly linear in their conversion of solar power to
electricity, that is, their overall conversion efficiency during operation
typically changes by less than 20%.
On the other hand, assessment of CSP production is more challenging
due to the non-linear nature of thermodynamic parameters.
Types of Data
Cross Sectional Data
Multiple individuals at the same time
Time Series Data
A single individual at multiple points in time
Panel or Longitudinal Data
Multiple individuals at multiple time periods
Regression Analysis
Problem – Estimation of Global Solar Radiation from
Meteorological Parameters (air temperature, relative humidity, etc.)
and Sunshine Duration
Angstrom-Prescott Model – A linear regression model (Monthly
average daily radiation at a particular location (H) v/s Monthly
average daily sunshine hours (S))
𝐻/𝐻0 = 𝑎 + 𝑏 · (𝑆/𝑆𝑚𝑎𝑥)
𝐻0 (the extraterrestrial radiation) and 𝑆𝑚𝑎𝑥 (the maximum possible sunshine duration, i.e. the day length) can be computed from astronomical parameters such as latitude and solar declination.
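The slides give no code; as a hedged illustration, the coefficients 𝑎 and 𝑏 can be estimated by ordinary least squares. The sketch below assumes NumPy, and the twelve monthly values are hypothetical placeholders rather than measured data.

```python
import numpy as np

# Hypothetical monthly-average values (12 months); replace with measured data.
H    = np.array([12.1, 14.3, 17.8, 20.5, 22.0, 21.4,
                 18.9, 18.2, 17.5, 15.0, 12.8, 11.5])   # global radiation, MJ/m^2/day
H0   = np.array([21.0, 25.1, 30.2, 35.0, 37.8, 38.9,
                 38.2, 35.6, 31.4, 26.3, 21.9, 19.8])   # extraterrestrial radiation
S    = np.array([7.0, 7.8, 8.5, 9.2, 9.8, 9.5,
                 8.0, 8.2, 8.4, 8.0, 7.4, 6.9])         # sunshine hours
Smax = np.array([10.4, 11.1, 12.0, 12.9, 13.6, 13.9,
                 13.8, 13.1, 12.2, 11.3, 10.5, 10.1])   # maximum possible sunshine hours

y = H / H0        # clearness index
x = S / Smax      # relative sunshine duration

# Least-squares fit of the Angstrom-Prescott regression y = a + b*x
A = np.column_stack([np.ones_like(x), x])
(a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
print(f"a = {a:.3f}, b = {b:.3f}")
```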
Statistical Tests
• The accuracy of the estimated models must be judged by
statistical indicators, such as the following (a short sketch computing these appears after the list)
• Correlation Coefficient
• Mean Bias Error
• Root Mean Square Error
• Percentage Error
• Coefficient of Determination
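As a small sketch (not from the slides), these indicators can be computed directly with NumPy; `measured` and `estimated` below are hypothetical arrays of observed and model-estimated radiation values.

```python
import numpy as np

def accuracy_indicators(measured, estimated):
    """Statistical indicators for judging an estimated model against measurements."""
    err  = estimated - measured
    mbe  = err.mean()                              # Mean Bias Error
    rmse = np.sqrt((err ** 2).mean())              # Root Mean Square Error
    pe   = 100.0 * err / measured                  # Percentage Error (per observation)
    r    = np.corrcoef(measured, estimated)[0, 1]  # Correlation Coefficient
    rss  = (err ** 2).sum()
    tss  = ((measured - measured.mean()) ** 2).sum()
    r2   = 1.0 - rss / tss                         # Coefficient of Determination
    return {"MBE": mbe, "RMSE": rmse, "r": r, "R2": r2, "PE": pe}

# Hypothetical example values
measured  = np.array([15.2, 17.8, 20.1, 21.5, 19.0])
estimated = np.array([15.6, 17.1, 20.8, 21.0, 19.4])
print(accuracy_indicators(measured, estimated))
```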
Dimensionality Reduction
The dimension of the data is the number of variables that are
measured on each observation. When the dataset is high-dimensional,
not all the measured variables are “important”. The analysis also
becomes computationally expensive. The removal of “irrelevant”
information is dimensionality reduction.
Given the 𝑝-dimensional random vector 𝑋 = (𝑥1 , 𝑥2 , … , 𝑥𝑝 ), the
problem is to find a lower-dimensional representation of it,
𝑆 = (𝑠1 , 𝑠2 , … , 𝑠𝑘 ) with 𝑘 ≤ 𝑝 that captures the information in the
original data, according to some criterion.
Dimensionality Reduction
The techniques of dimensionality reduction are mainly
classified into
(a) Linear (PCA, Factor Analysis, etc.)
(b) Non-linear (Kernel PCA, MDS, Isomap, etc.)
Linear techniques result in each of the 𝑘 ≤ 𝑝 components of
the new variable being a linear combination of the original
variables
𝑠𝑖 = 𝑤𝑖,1 𝑥1 + ⋯ + 𝑤𝑖,𝑝 𝑥𝑝 ,   𝑖 = 1, … , 𝑘
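As an illustration of a linear technique, the sketch below implements PCA with NumPy: each column of W holds the weights 𝑤𝑖,1 , … , 𝑤𝑖,𝑝 of one component, so each 𝑠𝑖 is a linear combination of the original variables. The data are randomly generated placeholders, not part of the slides.

```python
import numpy as np

def pca(X, k):
    """Project an n x p data matrix X onto its first k principal components."""
    Xc = X - X.mean(axis=0)                # centre each variable
    cov = np.cov(Xc, rowvar=False)         # p x p covariance matrix
    eigval, eigvec = np.linalg.eigh(cov)   # eigenvalues in ascending order
    W = eigvec[:, ::-1][:, :k]             # columns = weight vectors of top-k components
    return Xc @ W                          # s_i = w_i,1*x_1 + ... + w_i,p*x_p

# Hypothetical example: reduce 5 meteorological variables to k = 2 components.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
S = pca(X, k=2)
print(S.shape)   # (100, 2)
```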
Time Series Forecasting
• Linear Time Series Models (Under Stationarity)
• Simple Autoregressive (AR) Models
• Simple Moving Average (MA) Models
• Mixed ARMA Models
• Seasonal Models
• AR (1) model
𝑥𝑡 = 𝜑0 + 𝜑1 𝑥𝑡−1 + 𝑎𝑡
where {𝑎𝑡 } is assumed to be a white noise series with zero mean
and constant variance.
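A brief sketch, assuming NumPy and a simulated series (not real irradiance data): since the AR(1) model is a regression of 𝑥𝑡 on 𝑥𝑡−1, 𝜑0 and 𝜑1 can be recovered by least squares.

```python
import numpy as np

# Simulate an AR(1) series with known coefficients, then recover them.
rng = np.random.default_rng(1)
phi0_true, phi1_true, n = 2.0, 0.7, 500
x = np.empty(n)
x[0] = phi0_true / (1 - phi1_true)          # start near the process mean
for t in range(1, n):
    x[t] = phi0_true + phi1_true * x[t - 1] + rng.normal()   # white-noise shock a_t

# Regress x_t on x_{t-1} by least squares.
A = np.column_stack([np.ones(n - 1), x[:-1]])
(phi0_hat, phi1_hat), *_ = np.linalg.lstsq(A, x[1:], rcond=None)
print(f"phi0 ~ {phi0_hat:.3f}, phi1 ~ {phi1_hat:.3f}")
```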
Some Measures
• Order Determination of AR
• Partial Autocorrelation Function
• AIC or BIC
• Parameter Estimation – Any AR(p) model is similar to a multiple
regression model, so the least-squares method can be used to
estimate the parameters (see the sketch after this slide).
• Goodness of Fit
𝑅² = 1 − 𝑅𝑆𝑆/𝑇𝑆𝑆
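Putting the slide together as a hedged sketch: an AR(p) fit by least squares, R² = 1 − RSS/TSS as the goodness of fit, and a Gaussian-likelihood approximation of AIC for order determination. The simulated series below is hypothetical.

```python
import numpy as np

def fit_ar(x, p):
    """Least-squares fit of an AR(p) model; returns coefficients, R^2 and AIC."""
    n = len(x)
    y = x[p:]
    A = np.column_stack([np.ones(n - p)] +
                        [x[p - j: n - j] for j in range(1, p + 1)])  # lagged regressors
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = np.sum((y - A @ coef) ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    r2  = 1.0 - rss / tss                                   # R^2 = 1 - RSS/TSS
    aic = (n - p) * np.log(rss / (n - p)) + 2 * (p + 1)     # Gaussian-likelihood AIC
    return coef, r2, aic

# Hypothetical series: simulate AR(2) data, then pick the order with the smallest AIC.
rng = np.random.default_rng(2)
x = np.zeros(600)
for t in range(2, 600):
    x[t] = 1.0 + 0.5 * x[t - 1] + 0.3 * x[t - 2] + rng.normal()
best_p = min(range(1, 6), key=lambda p: fit_ar(x, p)[2])
print("selected order:", best_p)
```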
A Learning Algorithm - ANN
• Artificial Neural Networks – When the data is non-linear in
nature, an ANN is a good methodology for forecasting. The
gradient descent algorithm can be used to update the network
weights (a minimal sketch appears after the issues below).
• Issues
• How many hidden neurons?
• How many hidden layers?
• Overestimation
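The slides do not specify an architecture, so the sketch below is only an assumed, minimal example: one hidden layer of tanh neurons, trained by gradient descent to forecast the next value of a hypothetical noisy sine series from its previous value.

```python
import numpy as np

rng = np.random.default_rng(3)
series = np.sin(np.linspace(0, 20, 400)) + 0.1 * rng.normal(size=400)  # hypothetical data
X, y = series[:-1, None], series[1:, None]      # input x_{t-1}, target x_t

n_hidden, lr = 8, 0.05                          # one hidden layer, 8 neurons (assumed)
W1 = rng.normal(scale=0.5, size=(1, n_hidden)); b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.5, size=(n_hidden, 1)); b2 = np.zeros(1)

for epoch in range(2000):
    # forward pass
    h = np.tanh(X @ W1 + b1)                    # hidden layer activations
    y_hat = h @ W2 + b2                         # linear output
    err = y_hat - y
    # backward pass: gradients of the mean squared error
    n = len(X)
    dW2 = h.T @ err / n
    db2 = err.mean(axis=0)
    dh  = (err @ W2.T) * (1 - h ** 2)
    dW1 = X.T @ dh / n
    db1 = dh.mean(axis=0)
    # gradient-descent update of the weights
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print("final training MSE:", float((err ** 2).mean()))
```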
Thank You