At the time policy actions are taken (in terms of selecting the level of rt) the central
(2002) point out that in general if one introduces small measurement (observation) errors, a policy based on a signal may continue to be optimal in sense that the deviation from the perfect information case in terms of the resulting welfare loss caused by observation errors is negligible (see equation (5.22) of chapter 5, where the interest rate rule responds directly to the signal on private sector expectations). With large measurement errors, however, the policy rule that reacts to the available signal about ˆEtRt+1 can be very costly. In that case the central bank may choose
to get a better forecast of ˆEtRt+1 by running its own regression. In chapter 5, we
saw that the central bank understands that private sector expectations take the following form:
ˆ
EtRt+1 = γwt (6.1)
In practice the main problem for a central bank is that private sector agents– finan- cial analysts, investment bankers etc.– have become more and more sophisticated in analyzing and predicting monetary policy actions of the central bank. This in- creased degree of sophistication of the private sector makes it harder for the central bank to observe expectations with a good degree of precision. In fact observation errors on private sector expectations could be high.1
Since the central bank understands the forecasting rule of the private sector, sub- stituting (6.1) into the term structure equation and in turn the IS and Phillips equations gives the actual law of motion for wt+1 ≡ −(πt+1+ zt+1) (see chapter 5).
In the previous chapter, we saw that the transmission mechanism is such that at time t, controlling the variability of πt+2 is equivalent to controlling the variations
in wt+1, as the two differ only by an additive shock to inflation two period ahead.
We remember that the actual law of motion of wt+1 is given by:
wt+1 = ˜γwt− β1zt+ ˜βrt+ νt+1 (6.2)
where ˜β ≡ β2(1 − k) > 0 and ν is a composite shock, that is a linear combination of
the structural shocks to inflation, output and the term structure. The persistence parameter ˜γ ≡ 1+β2kγ > 0 is unknown to the central bank because of the unknown
parameter γ (see chapter 5 for details).2 Actually, (6.2) is the constraint that the
central bank faces in setting rt optimally and, ultimately, what matters for policy
is an estimate of ˜γ, denoted by ˜ct, based on (6.2). Note that as in the learning and
control literature, the constraint (6.2) also serves as a regression model to estimate the unknown persistence parameter.
1Since potentially γ could be time varying, it may be appropriate to have γ
twith assumptions
about its time series properties. Following Beck and Wieland (2002) it is possible to consider a random walk process.
2Note that under perfect knowledge, given these positive parameters, the central bank lowers
When active learning by the central bank is involved, the optimization problem usually gets complicated as the number of state variables increases (due to what is known as the ”curse of dimensionality”). In our case, besides the state variable
wt, the current output gap, zt, also appears as a state in equation (6.2). Together
with the beliefs about the mean and variance of the persistence parameter, there would be four state variables in the model. The dual control literature usually uses a constraint similar to (6.2) but with β1 = 0 so that next period’s state variable, wt+1, depends on the policy instrument, rt, and possibly lagged values of the state
variable; see Beck and Wieland (2002). In our case, setting β1 = 0 reduces the
number of state variables to three. This makes the dynamic equation comparable to that considered by Beck and Wieland (2002):3
wt+1= ˜γwt+ ˜βrt+ νt+1 (6.3)
Note that in the model of chapter 5, if β1 = 0, the perfect knowledge, rational
expectations solution gives ˆEtRt+1= (β1/β2)wt= 0. This is inconsequential to the
way learning is modeled under imperfect knowledge. The reason is that from the point of view of the central bank, the incentive for learning arises from the fact that
γ is some unknown parameter, and not from its specific value. In any case, the
motivation for central bank learning comes from the knowledge of the functional form of (6.1), where in the case β1 = 0, the central bank would be assumed to lack
information about the true γ being actually zero.
As before, when period t + 1 arrives, the central bank updates its estimate by including the latest available data (wt+1, wt, rt) in the regression equation (6.3).
Using the widely used method of recursive least squares we have the following updating equations for ˜ct and its variance, denoted by ˜pt (see for e.g. Beck and
Wieland, 2002; Pollock, 2002).4
˜ct+1 = ˜ct+ κt+1(wt+1− ˜ctwt− ˜βrt)
˜
pt+1 = ˜pt− κt+1wtp˜t (6.4)
where κt+1 ≡ wtp˜tFt+1−1 is commonly referred to as the Kalman gain, which is the
weight assigned to new information coming from the forecast error wt+1−˜ctwt− ˜βrt.
The Kalman gain in turn depends on the conditional variance of wt+1, (based on
information in period t), given by Ft+1≡ w2tp˜t+ σν2. As can be seen from (6.4), the
current state of the economy, wt is part of the Kalman gain, and affects the path
of the conditional variance of the parameter estimate, ˜pt+1. Due to the presence of
autoregressive behavior in wt+1, changes in the current state of the economy, wt,
3In Beck and Wieland (2002), the only unknown parameter is by assumption ˜β, which is
multiplicative to the control variable rt.
4If the unknown parameter is time-varying, the updating equations can be modified to allow
for this variability via the Kalman filter; see for instance Sargent (1999) and Beck and Wieland (2002).
have direct effects on the variability of wt+1, and consequently on the variability of
wt+2, wt+3, ....5
The updating equations in (6.4) capture the idea that central bank learning pos- sibilities about the unknown parameter is influenced by policy decisions made in period t, rt. The channel work as follows: rtaffects the state variable in period t+1,
wt+1, and consequently, beliefs about the unknown parameter (i.e. ˜ct+1 = c(wt+1)
and ˜pt+2 = p(wt+1)). The link between current and future policy choices is estab-
lished because expected future interest rate decisions by the central bank depend on the expected future state of the economy. From the principle of least squares estimation the precision of the estimate ˜ct+2 depends positively on the variance of
wt+1. One gets a more precise estimate (in other words, a smaller value of ˜pt+2)
when the variance of wt+1 increases, and vice versa. Since we recognize that the
current choice of monetary policy rt affects Etwt+1, and given predetermined Ft+1,
the coefficient of variation, defined by the ratio√Ft+1/Etwt+1 is also a function of
rt. This relationship between rt, wt+1 and ˜pt+2 raises a potential tension between
the urge to minimize current period loss from variability in rt (the control part)
and the need to get a more precise estimate of the degree of persistence in the economy that would help improve future outcomes (the learning part).
Under flexible inflation targeting, the central bank conducts monetary policy by adjusting the nominal interest rate, and with predetermined inflation expectations, perfectly controls the short-term real interest rate.6 The central bank chooses {rτ}∞τ =t to minimize the discounted sum of expected current and future losses, (see
chapter 5). min {rτ}∞τ =t Et ∞ X τ =t δτ −tLτ (6.5)
where Lt = 12(w2t + λrt2), λ > 0 is the relative weight assigned to the loss from the
variability in rt, the control variable.7 As before, the discount rate is denoted by δ,
(0 < δ < 1). The expectations operator Etrefers to the central bank’s expectations
conditional on information set in period t. Note that when λ → ∞, the central bank engages in full stabilization of rt, i.e., rt = 0 for all t.8
In a strict inflation targeting regime with passive learning, the central bank mini- mizes the inter-temporal loss function (6.5) setting λ = 0 and taking into account
5The assumption that the shock is normally distributed with known variance is standard in
the learning literature. If the prior belief also have a normal distribution, then the posterior belief is a normal distribution. This property of the posterior belief is convenient when dealing with numerical computations (see Appendix B).
6Of course the central bank directly controls the nominal rate, but because of the control lags,
the one-period ahead expected inflation is predetermined, so that the central bank has perfect control of the real short rate.
7This follows the learning and control literature. See Beck and Wieland (2002) for a detailed
discussion.
8In chapter 5, section 5.10, we did a simulation to see the consequences for parameter learning
the linear dynamic equation (6.3). The optimization problem would give solutions for rt analogous to the previous chapter, as the problem is solved period-by-period
and the first-order condition sets the conditional expectation of inflation two-period ahead equal to the desired target. As we will see below, this is no longer true with flexible inflation targeting (λ > 0) as future losses and the discount rate play a role in the choice of the current policy rate.