
2.2 Sources of Data and Construction of the Data Set

2.2.1 Regional Demand

The primary purpose of X is to produce Z, and the primary purpose of Z is to produce T. Put differently, any production of T creates a demand for Z, and any production of Z creates a demand for X. Given this input-output relation, T production is the ultimate source of demand and hence the ultimate source of demand variation. It is therefore possible to capture the regional demand fluctuations affecting X by incorporating T into the analysis. To this end, “relevant” T production around each Z location is monitored over time; consequently, the first source of data draws on T production.

Figure 2.2: Spatial Production Patterns in X, Z and T

Figure 2.2 illustrates the spatial relation between X, Z and T. Let $t_i$ refer to the volume of T production at each center, and let $z_i$ correspond to the locations to which X is delivered (henceforth, X locations). The assumption employed here is $z_i = x_i$: consistent with the primary purpose of X, all shipments to these locations are used in the production of Z. In the figure, $r_i$ marks the limit distance within which it is considered viable to supply Z. It is initially set to $\bar{r}$, an industry rule of thumb.

Two defining features of T are the customer needs, N, and the production process, P. Together they define the volume of T produced at each location and hence the quantity of Z required, $Q^D_Z$. Production of T is subject to licensing, and licenses are provided at the district level.¹ The licensing data include the monthly volume of T production, $Q_{npdt}$, broken down by customer need $n = 1, 2, \dots, \bar{N}$, production process $p = 1, 2, \dots, \bar{P}$, month $t = 1, 2, \dots, \bar{T}$ and district $l = 1, 2, \dots, \bar{L}$.

Producers of T prefer not to source Z from distant providers. In other words, only T production within a reasonable radius can create demand for a given Z producer. To identify the “relevant” T locations that might be associated with each Z location, the distances between Z locations and T locations must be retrieved.

¹ Perhaps a better term might be “district plus”, as the data have information on districts and counties. In the Turkish administrative system, the ordering of administrative regions from specific to general is neighbourhood, district, county, province.

T production data are at the district level. Therefore, first, the coordinates of all districts are manually retrieved from Google Maps. Even though T production can be anywhere within a given district, distance is calculated in reference to a single point. As the first choice, the coordinates of a public building (e.g. municipality, hospital, post office) are used. If no public building is registered in Google Maps for that district, a central location is chosen arbitrarily. Ambiguity about the exact coordinates of T production inevitably causes some measurement error; however, since the districts are small, the error is expected to be negligible. Second, the coordinates of X locations are identified. Third, using Google servers, a matrix of point-to-point driving distances between each district and each X location is constructed in an automated manner.

After retrieving the distance information, three regional monthly demand indices are constructed from nearby T activity. For each X location, the first measure is the aggregate monthly T volume within $\bar{r}$ units of driving distance; the second measure uses $0.8\bar{r}$ as the cut-off; the third uses $0.6\bar{r}$.
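The construction of the three indices can be sketched as follows. This is a minimal illustration rather than the procedure actually used: the function name, the dictionary layout, the sample distances and volumes, and the choice of an inclusive cut-off (distance less than or equal to the radius) are all assumptions made for the example.

```python
from typing import Dict, Tuple

def demand_indices(
    distance: Dict[Tuple[str, str], float],  # (x_location, district) -> driving distance
    volume: Dict[Tuple[str, int], float],    # (district, month) -> T volume
    r_bar: float,
    shares: Tuple[float, ...] = (1.0, 0.8, 0.6),
) -> Dict[Tuple[str, int, float], float]:
    """Sum district-level T volume within each cut-off distance of each X location."""
    x_locations = sorted({x for x, _ in distance})
    months = sorted({t for _, t in volume})
    index: Dict[Tuple[str, int, float], float] = {}
    for x in x_locations:
        for t in months:
            for s in shares:
                cutoff = s * r_bar
                total = 0.0
                for (d, month), q in volume.items():
                    # Districts with no recorded distance are treated as out of range.
                    if month == t and distance.get((x, d), float("inf")) <= cutoff:
                        total += q
                index[(x, t, s)] = total
    return index

# Hypothetical data: one X location, four districts, one month.
example_distance = {("x1", "A"): 10.0, ("x1", "B"): 35.0, ("x1", "C"): 45.0, ("x1", "D"): 70.0}
example_volume = {("A", 1): 100.0, ("B", 1): 50.0, ("C", 1): 30.0, ("D", 1): 20.0}
example_index = demand_indices(example_distance, example_volume, r_bar=50.0)
```

With these made-up numbers, shrinking the radius from 50 to 40 and then to 30 drops district C and then district B from the index, which shows how the narrower cut-offs yield progressively more local demand measures.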

Not surprisingly, the demand for T exhibits seasonality and is prone to calendar effects. The first option for stripping out seasonal effects is to use an indicator variable for each calendar month. The benefit of this approach is that the fixed effects capture seasonality not only from the demand side but also from the supply side. However, there are two drawbacks. First, the data are short in the time dimension: as they span 18 months, only 6 of the calendar months are observed more than once. Second, employing calendar month fixed effects forces an additive seasonal structure on the data. The second option is to seasonally adjust the demand data first and use the adjusted data in the estimation. This allows experimenting with more flexible seasonal patterns. However, (i) any supply-side seasonality in X production would not be captured, and (ii) as the process is sensitive to outliers, identifying seasonal patterns would be challenging with disaggregated data.²
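The first drawback of calendar month indicators is simple arithmetic, which the short sketch below makes explicit. The helper name and the starting month are hypothetical; the text only states that the sample spans 18 months.

```python
from collections import Counter

def calendar_month_counts(start_month: int, n_months: int) -> Counter:
    """Count how often each calendar month (1-12) occurs in a run of consecutive months."""
    return Counter((start_month - 1 + k) % 12 + 1 for k in range(n_months))

# Assumed start in January; any other start month gives the same number of repeats.
counts = calendar_month_counts(start_month=1, n_months=18)
repeated = sorted(m for m, c in counts.items() if c > 1)
```

Whatever the starting month, an 18-month window covers exactly 6 calendar months twice and the remaining 6 only once, so half of the monthly indicators would be identified from a single observation per location.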

Table 2.1: Seasonal Adjustment

| | n2p2 | n1p2 | n1p1 | n2p1 | Total |
|---|---|---|---|---|---|
| Log Transformed | Yes | Yes | Yes | Yes | Yes |
| Model Fitted | [(0,1,1)(0,0,0)] | [(0,1,1)(0,1,1)] | [(0,1,1)(0,1,1)] | [(0,1,1)(0,1,1)] | [(0,1,1)(0,1,1)] |
| Calendar Effects | Yes | Yes | Yes | Yes | Yes |
| p-value | 0.0007 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| Seasonality (p-value) | ISNT | ISP | ISP | ISP | ISP |
| Stable Seasonality (FS) | 0.1809 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| Kruskal-Wallis (W) | 0.1028 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| Seasonality Assuming Stability (T) | 0.1507 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| Evolutive Seasonality (FM) | 0.2107 | 0.3994 | 0.7459 | 0.2285 | 0.0000 |
| Residual Seasonality (R) (F Stat) | 1.43 | 0.31 | 0.64 | 0.71 | 1.22 |

Null hypotheses (H0):
FS: No seasonality. Monthly averages are not different.
W: No seasonality. Monthly averages are not different.
FM: Seasonal effect across years is not different.
T: Seasonality is not present.
R: Residuals do not have seasonality.

² Potential differences in seasonal patterns across different combinations of customer needs N and production process P make the task more challenging.

As a solution, the data are stripped of seasonal effects at the aggregated level, and the adjusted aggregate is then disaggregated to districts in proportion to each district’s weight in the unadjusted data. This works as follows. First, a weight $\omega_{npdt}$ is calculated for each district $d$, month $t$ and combination $(n, p)$. It is based on (i) the seasonally unadjusted monthly district-level volume, $Q_{npdt}$, and (ii) the total monthly unadjusted volume, $\sum_d Q_{npdt} = Q_{npt}$, so that

$$\omega_{npdt} = \frac{Q_{npdt}}{\sum_d Q_{npdt}}.$$

Next, the unadjusted aggregated data are seasonally adjusted at the aggregated level for each combination $(n, p) = (1,1), (1,2), (2,1), (2,2)$. After retrieving the seasonally adjusted aggregate series $\tilde{Q}_{npt}$, the district-level seasonally adjusted series are calculated by multiplying the district weights by the seasonally adjusted aggregate data:

$$\tilde{Q}_{npdt} = \omega_{npdt}\,\tilde{Q}_{npt}.$$

This process is repeated for the three radii, $r_i = \bar{r}, 0.8\bar{r}, 0.6\bar{r}$.
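The proportional disaggregation step can be sketched for a single month and a single $(n, p)$ combination as follows. The function name, the district labels and the numbers are illustrative assumptions, and the handling of a month with zero total volume is a choice made here, not something the text specifies.

```python
from typing import Dict

def disaggregate(unadjusted: Dict[str, float], adjusted_total: float) -> Dict[str, float]:
    """Allocate a seasonally adjusted aggregate to districts by their unadjusted shares."""
    total = sum(unadjusted.values())
    if total == 0:
        # Assumption: with no unadjusted volume there is nothing to allocate.
        return {d: 0.0 for d in unadjusted}
    # Weight of each district in the unadjusted aggregate for this month.
    weights = {d: q / total for d, q in unadjusted.items()}
    # District-level adjusted volume = weight times adjusted aggregate.
    return {d: w * adjusted_total for d, w in weights.items()}

# Hypothetical month: unadjusted district volumes sum to 100, adjusted aggregate is 80.
example_unadjusted = {"A": 60.0, "B": 30.0, "C": 10.0}
example_adjusted = disaggregate(example_unadjusted, adjusted_total=80.0)
```

In this made-up month the districts receive 48, 24 and 8 units (up to floating-point rounding). By construction the district-level adjusted volumes sum to the adjusted aggregate, and each district keeps its share of the unadjusted total.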

Table 2.1 reports diagnostics and seasonality tests. Since T information is publicly available, the seasonal adjustment is not restricted to the sample period; it builds on eight years of monthly data (96 months) of T production.³ Demetra is used as the seasonal adjustment software. It compares alternative models and suggests the model most suitable to the data based on information criteria. It also automatically reports many diagnostic tests that are informative about the nature of seasonality. The seasonal adjustment method is TRAMO-SEATS of Gomez and Maravall (1996). ISNT stands for “identifiable seasonality is not present”, and ISP for “identifiable seasonality is present”. The results indicate that the seasonal pattern, when present, follows a multiplicative structure; thus the series are log transformed. In four of the five series in Table 2.1, an ARIMA model of [(0,1,1)(0,1,1)]⁴ has the highest likelihood.⁵ Residuals do not exhibit seasonality, indicating that the model does a good job of removing seasonality. Calendar effects are consistently significant across subgroups. The findings indicate that n2p2 is in stark contrast to the other $(n, p)$ combinations. Neither the Friedman test, which is concerned with whether subsamples in a sample are governed by the same distribution, nor the Kruskal-Wallis test, which compares subsample means, finds evidence of seasonal patterns in this series. Since there is no evidence of seasonality for $Q_{22dt}$, the series have been adjusted only for $(n, p) = (1,1), (1,2), (2,1)$. The final regional demand index, $t_{nplt}$, aggregates the seasonally adjusted indices for $(n, p) = (1,1), (1,2), (2,1)$ and the unadjusted index for $(n, p) = (2,2)$ into a single index.
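To give a sense of how a rank-based seasonality check of the Kruskal-Wallis type operates, the sketch below groups a monthly series by calendar month and computes the H statistic. It is not Demetra's implementation: the function names are hypothetical, no tie correction is applied, and the exact series that the software feeds into the test is not reproduced here.

```python
from typing import List, Sequence

def group_by_calendar_month(series: Sequence[float], start_month: int = 1) -> List[List[float]]:
    """Split a monthly series into 12 groups, one per calendar month."""
    groups: List[List[float]] = [[] for _ in range(12)]
    for k, value in enumerate(series):
        groups[(start_month - 1 + k) % 12].append(value)
    return groups

def kruskal_wallis_h(groups: Sequence[Sequence[float]]) -> float:
    """Kruskal-Wallis H statistic using average ranks for ties (no tie correction)."""
    pooled = sorted((v, g) for g, values in enumerate(groups) for v in values)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    i = 0
    while i < n:
        j = i
        # Extend j over the block of observations tied with position i.
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            rank_sums[pooled[k][1]] += avg_rank
        i = j + 1
    weighted = sum(r * r / len(g) for r, g in zip(rank_sums, groups) if len(g) > 0)
    return 12.0 / (n * (n + 1)) * weighted - 3.0 * (n + 1)

# Three clearly separated groups give a large H (approximately 7.2 here).
example_h = kruskal_wallis_h([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]])
```

Under the null hypothesis that all groups share the same distribution, H is approximately chi-square distributed with (number of groups minus 1) degrees of freedom, so a small H (large p-value), as reported for n2p2, is consistent with no identifiable seasonality.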