Functional linear model - Empirical results

4.2.2 Functional linear model

A functional linear model is utilised to predict the evolution of the implied volatility process. Classical linear models seek to describe the dependency between a response variable and a specified set of predictors. In classical regression, scalar values are used for both the explanatory and response variables. In functional linear regression, however, at least one of the observed explanatory variables is a curve. This means that functional analogues of the classical linear regression coefficients must be constructed, and the procedure varies according to the model structure. Given that the explanatory variable adopted in our study is the implied volatility function, we employ two model structures:

1. Scalar response/functional explanatory, which takes the form $y = \alpha + \int \beta(m)\,\tilde{x}(m)\,dm + \varepsilon$ (Horváth and Kokoszka 2012) (scalar response model, henceforth)

2. Functional response/functional explanatory, which takes the form $y(m) = \alpha(m) + \int \beta(m, s)\,\tilde{x}(s)\,ds + \varepsilon(m)$ (Horváth and Kokoszka 2012) (fully functional model, henceforth)

The forthcoming sections discuss these two models in detail.

4.2.2.1 Scalar response model

We utilise the scalar response framework to find the dependency between the current-day, $t$, implied volatility function, $\tilde{x}_t(m)$, and the one-day-ahead, $t+1$, implied volatility scalar response for a particular contract, $x_{t+1}(m_k)$:

$$x_{t+1}(m_k) = \alpha + \int_{\Omega_m} \beta(m)\,\tilde{x}_t(m)\,dm + \varepsilon_t,$$

where $\Omega_m$ is the defined moneyness range, and where $\hat{\beta}(m)$ is found by minimising:

$$\sum_{t=1}^{T-1} \left[ x_{t+1}(m_k) - \alpha - \int_{\Omega_m} \beta(m)\,\tilde{x}_t(m)\,dm \right]^2. \qquad (4)$$

In classical linear regression, there must be fewer explanatory variables than observations. A functional explanatory variable, however, acts as an infinite-dimensional predictor of a finite set of responses. This means that an exact fit is always possible, leading to $\varepsilon = 0$. It also means that an infinite number of possible $\beta(m)$ coefficients will produce the same predictions. Dimension reduction through a basis expansion of $\beta(m)$, as in Section 4.2.1, is proposed by Ramsay and Silverman (2005) to solve this underdetermination issue. The smaller the number of basis functions, the smoother the estimated function $\hat{\beta}(m)$. However, a low-dimensional basis may not be appropriate, as it has the potential to omit important dependency dynamics. To allow for the use of a high-dimensional basis, $\hat{\beta}(m)$ can be smoothed to obtain an appropriate estimate of the continuum-varying coefficient $\beta(m)$. This is done by imposing a roughness penalty which penalises deviations from $\frac{d^2}{dm^2}\hat{\beta}(m) = 0$. After incorporating the penalty, a smoothed $\hat{\beta}(m)$ is found by minimising:

$$\sum_{t=1}^{T-1} \left[ x_{t+1}(m_k) - \alpha - \int_{\Omega_m} \beta(m)\,\tilde{x}_t(m)\,dm \right]^2 + \lambda_\beta \int \left[ \frac{d^2}{dm^2}\beta(m) \right]^2 dm,$$

where $\lambda_\beta$ is the weighting attributed to the smoothing penalty. Given that 5 and 95 represent the lower and upper bound delta values in the data set, we can define our model as:

$$x_{t+1}(m_k) = \alpha + \int_{5}^{95} \beta(m)\,\tilde{x}_t(m)\,dm + \varepsilon_t.$$
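The penalised criterion above can be illustrated numerically. The following is a minimal sketch, not the estimation procedure used in this study: it replaces the basis expansion with a simple grid discretisation of $\beta(m)$, approximates the integral with trapezoidal weights and the second derivative with second differences, and runs on simulated curves. The function names (`trapezoid_weights`, `roughness_matrix`, `fit_scalar_response`), the grid, and all simulated values are assumptions made for illustration only.

```python
import numpy as np

def trapezoid_weights(grid):
    """Weights w such that w @ f(grid) approximates the integral of f over the grid."""
    h = np.diff(grid)
    w = np.zeros_like(grid, dtype=float)
    w[:-1] += h / 2.0
    w[1:] += h / 2.0
    return w

def roughness_matrix(grid):
    """Matrix R with b @ R @ b approximating the integral of (b'')^2 (equally spaced grid)."""
    G = grid.size
    h = grid[1] - grid[0]
    D2 = np.diff(np.eye(G), n=2, axis=0) / h**2  # rows are second-difference stencils
    return h * D2.T @ D2

def fit_scalar_response(X, y, grid, lam):
    """Minimise sum_t (y_t - alpha - int beta x_t dm)^2 + lam * int (beta'')^2 dm on a grid.

    X is (n, G) with row t holding the curve observed on day t; y is (n,).
    Returns the intercept and beta evaluated on the grid.
    """
    n, G = X.shape
    w = trapezoid_weights(grid)
    Z = np.hstack([np.ones((n, 1)), X * w])   # intercept column plus weighted curves
    P = np.zeros((G + 1, G + 1))
    P[1:, 1:] = roughness_matrix(grid)        # the intercept is not penalised
    theta = np.linalg.solve(Z.T @ Z + lam * P, Z.T @ y)
    return theta[0], theta[1:]

# Simulated illustration (hypothetical data, not the implied volatility sample).
rng = np.random.default_rng(0)
grid = np.linspace(5.0, 95.0, 46)             # delta range 5 to 95
u = (grid - 50.0) / 45.0
n = 200
coef = rng.normal(size=(n, 3))
X = 0.2 + 0.05 * (coef[:, [0]] + coef[:, [1]] * u + coef[:, [2]] * u**2)
X += 0.01 * rng.normal(size=X.shape)
beta_true = 0.01 * np.sin(np.pi * (grid - 5.0) / 90.0)
y = 0.02 + X @ (trapezoid_weights(grid) * beta_true) + 0.002 * rng.normal(size=n)

alpha_lo, beta_lo = fit_scalar_response(X, y, grid, lam=1e-2)
alpha_hi, beta_hi = fit_scalar_response(X, y, grid, lam=1e4)

R = roughness_matrix(grid)
print("roughness, small lambda:", beta_lo @ R @ beta_lo)
print("roughness, large lambda:", beta_hi @ R @ beta_hi)
```

Increasing `lam` trades a slightly larger residual sum of squares for a smoother coefficient function, which is the role $\lambda_\beta$ plays in the criterion above.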

4.2.2.2 Fully functional model

We utilise the fully functional model as an exploratory tool only, to assess the dependency between the current-day, $t$, implied volatility function, $\tilde{x}_t(m)$, and the one-day-ahead, $t+1$, implied volatility function, $\tilde{x}_{t+1}(m)$. Given that both variables are expressed in terms of moneyness, we use the notation $m$ and $m'$ to distinguish between the moneyness domains of the current-day, $t$, and the next-day, $t+1$, implied volatility functions, respectively. We specify a fully functional model based on the historical linear framework proposed by Malfait and Ramsay (2003):

$$\tilde{x}_{t+1}(m') = \alpha(m') + \int_{\Omega_m} \beta(m, m')\,\tilde{x}_t(m)\,dm + \varepsilon_t(m') \qquad (5)$$

where $\Omega_m$ contains the range of $m$ over which $\tilde{x}_t(m)$ is considered to influence $\tilde{x}_{t+1}(m')$. We predict $\tilde{x}_{t+1}(m')$ using the entire range of the $\tilde{x}_t(m)$ function, i.e., 5 to 95 delta.

In a similar vein to the scalar response model, dimension reduction through a double basis expansion of $\beta(m, m')$, in terms of both $m$ and $m'$, is used to solve the underdetermination issue. The smaller the number of basis functions, the smoother the estimated function $\hat{\beta}(m, m')$. However, two low-dimensional bases may not be appropriate, as they have the potential to omit important curve dynamics. To overcome this issue, Ramsay and Silverman (2005) apply an additional roughness penalty to smooth over the ranges specified by both $m$ and $m'$. Weightings for the penalties are defined as $\lambda_1$ and $\lambda_2$, with the penalty being structured as follows:

$$\lambda_1 \iint \left[ \frac{\partial^2}{\partial m'^2}\beta(m, m') \right]^2 dm\,dm' + \lambda_2 \iint \left[ \frac{\partial^2}{\partial m^2}\beta(m, m') \right]^2 dm\,dm'.$$

Given that the specified explanatory and response variables are both curves, the resultant $\hat{\beta}(m, m')$ takes the form of a three-dimensional surface, which we present in Section 4.5.
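To make the two-way penalty concrete, the sketch below fits a discretised version of the fully functional model on simulated curves. It is an illustration under stated assumptions rather than the double basis expansion described above: $\beta(m, m')$ is represented directly on a grid, the residual sum of squares is taken over days and grid points (any constant factor from integrating over $m'$ is absorbed into the weights), and the names `quad_weights`, `second_diff_penalty`, `surface_roughness` and `fit_fully_functional` are hypothetical. Setting the gradient of the penalised criterion to zero gives a Sylvester-type equation, which is solved here through its Kronecker form.

```python
import numpy as np

def quad_weights(G, h):
    """Trapezoidal weights on an equally spaced grid of G points with step h."""
    w = np.full(G, h)
    w[[0, -1]] = h / 2.0
    return w

def second_diff_penalty(G, h):
    """Matrix R with b @ R @ b approximating the integral of (b'')^2."""
    D2 = np.diff(np.eye(G), n=2, axis=0) / h**2
    return h * D2.T @ D2

def surface_roughness(B, h):
    """Sum of the two unweighted roughness integrals (in m' and in m) of the surface B."""
    R = second_diff_penalty(B.shape[0], h)
    return h * np.trace(B @ R @ B.T) + h * np.trace(B.T @ R @ B)

def fit_fully_functional(X, Y, h, lam1, lam2):
    """X, Y: (n, G) curves on day t and day t+1 on the same grid.

    Returns alpha of shape (G,) and B of shape (G, G), with B[i, j] ~ beta(m_i, m'_j).
    """
    n, G = X.shape
    w = quad_weights(G, h)
    x_bar, y_bar = X.mean(axis=0), Y.mean(axis=0)
    A = (X - x_bar) * w                     # centring removes the unpenalised intercept
    R = second_diff_penalty(G, h)
    left = A.T @ A + lam2 * h * R           # acts on the m index (rows of B)
    right = lam1 * h * R                    # acts on the m' index (columns of B)
    # Solve left @ B + B @ right = A.T @ Yc using column-major vectorisation.
    K = np.kron(np.eye(G), left) + np.kron(right.T, np.eye(G))
    q = (A.T @ (Y - y_bar)).flatten(order="F")
    B = np.linalg.solve(K, q).reshape((G, G), order="F")
    alpha = y_bar - (x_bar * w) @ B
    return alpha, B

# Simulated illustration (hypothetical data).
rng = np.random.default_rng(1)
grid = np.arange(5.0, 96.0, 5.0)            # 5, 10, ..., 95 delta
h, n, G = 5.0, 300, grid.size
u = (grid - 50.0) / 45.0
c = rng.normal(size=(n, 3))
X = 0.2 + 0.05 * (c[:, [0]] + c[:, [1]] * u + c[:, [2]] * u**2)
X += 0.01 * rng.normal(size=X.shape)
B_true = 0.02 * np.exp(-((grid[:, None] - grid[None, :]) / 20.0) ** 2)
w = quad_weights(G, h)
Y = 0.05 + (X * w) @ B_true + 0.002 * rng.normal(size=(n, G))

alpha_hat, B_hat = fit_fully_functional(X, Y, h, lam1=1.0, lam2=1.0)
_, B_smooth = fit_fully_functional(X, Y, h, lam1=100.0, lam2=100.0)
rmse = np.sqrt(np.mean((Y - alpha_hat - (X * w) @ B_hat) ** 2))
print("in-sample RMSE:", rmse)
print("roughness:", surface_roughness(B_hat, h), surface_roughness(B_smooth, h))
```

The array `B_hat` is the grid counterpart of the surface $\hat{\beta}(m, m')$; plotting it against the two moneyness axes gives the kind of three-dimensional object described in the text.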

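In order to assess how well the functional models fit the data, functional versions of the widely employed $R^2$ statistic and F-ratio are applied:

$$R^2(m) = 1 - \frac{\sum_{t=1}^{T-1} \left( x_{t+1}(m_k) - \hat{x}_{t+1}(m_k) \right)^2}{\sum_{t=1}^{T-1} \left( x_{t+1}(m_k) - \bar{x}_{t+1}(m_k) \right)^2}$$

where $x_{t+1}(m_k)$ is the observed response, $\bar{x}_{t+1}(m_k)$ is the mean of the observed responses and $\hat{x}_{t+1}(m_k)$ is the fitted response, and

$$F\text{-Ratio} = \frac{\left[ \sum_{t=1}^{T-1} \left( x_{t+1}(m_k) - \bar{x}_{t+1}(m_k) \right)^2 - \sum_{t=1}^{T-1} \left( x_{t+1}(m_k) - \hat{x}_{t+1}(m_k) \right)^2 \right] / (df - 1)}{\sum_{t=1}^{T-1} \left( x_{t+1}(m_k) - \hat{x}_{t+1}(m_k) \right)^2 / (T - df)}$$

where $T$ is the number of days in the sample and $df$ is the equivalent degrees of freedom for the fit.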
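As a small worked example, the two statistics can be computed for a single contract $m_k$ as follows. This is a sketch using made-up numbers; the function names `r_squared` and `f_ratio` are hypothetical, and the value of `df` is passed in directly rather than derived from a fitted smoother.

```python
import numpy as np

def r_squared(y, y_hat):
    """1 - SSE / SST for observed responses y and fitted responses y_hat."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    sse = np.sum((y - y_hat) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    return 1.0 - sse / sst

def f_ratio(y, y_hat, df, T):
    """((SST - SSE) / (df - 1)) / (SSE / (T - df)), following the formula in the text."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    sse = np.sum((y - y_hat) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    return ((sst - sse) / (df - 1.0)) / (sse / (T - df))

# Five one-day-ahead responses, i.e. T - 1 = 5 pairs from a sample of T = 6 days.
y_obs = [1.0, 2.0, 3.0, 4.0, 5.0]
y_fit = [1.1, 1.9, 3.2, 3.9, 4.9]

# Here SST = 10 and SSE = 0.08, so R^2 is about 0.992 and, with df = 2, F is about 496.
print(r_squared(y_obs, y_fit))
print(f_ratio(y_obs, y_fit, df=2, T=6))
```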

In document Cross asset class applications of functional data analysis: evaluation with controls for data snooping bias (Page 71-74)