
6.3 Data Description and Hypotheses

6.3.1 Data description

All data were obtained from the TV channel VOX, which broadcasts “The Perfect Dinner,” by collecting data from their website and re-watching episodes to obtain missing data [3]. The ratings for the show during the period investigated (2006-2011) were kindly provided by VOX. We excluded the first 24 weeks of our dataset when estimating all but one specification of the econometric models; in the remaining specification, we test for the effect of the initial 24 weeks on voting behavior. These weeks are excluded because they represent the time span between the filming and the broadcast of the first show. Since one of the factors we want to measure is the impact of past evaluation behavior, we must exclude this time span in all but one specification, because contestants could not have had any knowledge of previous voting behavior prior to the first broadcast. Without the first 24 weeks, the final sample consists of 3735 cooking assessments observed in 237 rounds of the game. Including the first 24 weeks leads to 4322 observations.

[3] Part of our data set is available in disaggregated form on the homepage of “Das perfekte Dinner” on the website of the German TV channel VOX: http://www.vox.de/kochen/das-perfekte-dinner/details.

Table 6.1 gives an overview of the descriptive statistics for the variables used in the econometric analysis, which is discussed in more detail in section 4. The points given to the cook by the other contestants are the dependent variable (“Points”). Its distribution is skewed to the left, which is due to the very rare occurrence of poor evaluations. The mode and the median both take the value of eight and hence are relatively close to the highest possible value. Table 6.2 displays the distribution of “Points” for the estimation sample. The weekday (Monday-Friday) on which the contestant to be evaluated performs enters the econometric model as a set of four binary indicators, one for each weekday from Tuesday to Friday. Monday is the base category and therefore does not appear in the table. Since the time schedule does not vary across rounds, the weekday and the order of cooking are equivalent. That is, cooking on Monday implies cooking first, cooking on Tuesday implies cooking second, etc. “Evaluator Already Cooked” is equal to one if a participant has already cooked at the time he must evaluate a dinner and zero otherwise. “Number of ingredients” represents the absolute number of ingredients used in a dinner. “Number of ingredients²/100” is the square of “number of ingredients”, divided by 100. This division factor is used to ensure that the explanatory variables are of similar numerical magnitude, which renders the estimation procedure more stable and the coefficients more easily comparable. “Level” measures the difficulty level and “price” the price level of the menu [4].

“Av. evaluation level” reports the average points given during the 24 weeks before a contest. “Av. share viewers” represents the TV market share of the respective show. The minimum equals 7.85%, while the maximum is 12.23%. For example, if 10 million people were watching TV on Monday during the airing of “The Perfect Dinner”, a 10% share implies that 1 million people were watching the show. A difference of less than five percentage points might not seem large, but for a show in the relevant market segment, it represents the difference between a mediocre and a successful show. “Population” measures the population of the town in which the show is filmed, in millions of inhabitants. For example, the largest city in our sample, with 8.1 million inhabitants, has a value of 8.1, and the smallest, with 8,000 inhabitants, a value of 0.008. “Foreign” indicates filming locations outside of Germany: it equals 1 if the show takes place outside Germany and 0 if it takes place in Germany. “Time” is the number of days since the first recording of the show, divided by 1000. “Time²/1000” is the square of “time”, divided by 1000 [5]. The division factor of 1000 has been chosen to ensure better comparability between the coefficients in our analysis later on.

Furthermore, we report descriptive statistics on a range of individual characteristics such as gender, migration status, age, profession and hair color. Except for age, all individual characteristics are dummy variables. “Dissimilar” measures the social dissimilarity of the evaluator and the cook, based on the individual social characteristics listed in Table 6.1. The closer the value is to one, the less similar the two contestants are. In computational terms, “dissimilar” is the rescaled Mahalanobis distance between the vectors of the two participants’ socioeconomic characteristics [6].
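The construction of “dissimilar” can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a hypothetical sample of four contestants described by only two characteristics (age and a female dummy), whereas the paper uses the full set of characteristics in Table 6.1. The sketch estimates the variance-covariance matrix from the sample, computes the Mahalanobis distance sqrt((x_i - x_j)' V(x)^{-1} (x_i - x_j)) for every pair, and divides by the largest distance.

```python
from itertools import combinations
from math import sqrt

# Hypothetical contestants: (age, female dummy). Illustrative values only.
people = {
    "A": (25.0, 1.0),
    "B": (25.0, 1.0),
    "C": (40.0, 0.0),
    "D": (62.0, 1.0),
}

def covariance_2x2(rows):
    """Sample variance-covariance matrix (n - 1 denominator) of two columns."""
    n = len(rows)
    m0 = sum(r[0] for r in rows) / n
    m1 = sum(r[1] for r in rows) / n
    v00 = sum((r[0] - m0) ** 2 for r in rows) / (n - 1)
    v11 = sum((r[1] - m1) ** 2 for r in rows) / (n - 1)
    v01 = sum((r[0] - m0) * (r[1] - m1) for r in rows) / (n - 1)
    return v00, v01, v11

def mahalanobis(x, y, cov):
    """sqrt((x - y)' V^{-1} (x - y)) using the closed-form 2x2 inverse."""
    v00, v01, v11 = cov
    det = v00 * v11 - v01 ** 2
    d0, d1 = x[0] - y[0], x[1] - y[1]
    quad = (v11 * d0 ** 2 - 2 * v01 * d0 * d1 + v00 * d1 ** 2) / det
    return sqrt(quad)

cov = covariance_2x2(list(people.values()))
distances = {
    (i, j): mahalanobis(people[i], people[j], cov)
    for i, j in combinations(people, 2)
}

# Rescale so that the most dissimilar pair in the sample has the value one.
max_distance = max(distances.values())
dissimilar = {pair: d / max_distance for pair, d in distances.items()}

for pair, value in sorted(dissimilar.items()):
    print(pair, round(value, 3))
```

In this toy sample, contestants A and B share both characteristics and therefore receive a value of zero, while the pair with the largest distance receives a value of one. Because the distance is weighted by the inverse variance-covariance matrix, a difference in a characteristic with high variance counts for less than the same difference in a characteristic with low variance.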

[4] The difficulty level and the price level are determined by experts on the website http://www.kochbar.de.

[5] The division factor of 1000 is chosen in order to ensure similar magnitudes of the explanatory variables; see above.

[6] The Mahalanobis distance is defined as $MD_{ij} = \sqrt{(x_i - x_j)' \, V(x)^{-1} \, (x_i - x_j)}$, with $x_i$ and $x_j$ denoting the column vectors of socioeconomic characteristics of individuals $i$ and $j$, respectively, and $V(x)$ denoting the (estimated) variance-covariance matrix of the socioeconomic characteristics $x$. The variable “dissimilar” is defined as $MD_{ij} / \max(MD)$. That is, “dissimilar” is normalized to one for the most dissimilar pair of individuals in the sample, while it takes the value of zero for a pair of individuals who share all considered socioeconomic characteristics.

As a scaled Mahalanobis distance, the variable “dissimilar” is constructed from a weighted sum of squared deviations (and cross-products of deviations) in the considered socioeconomic characteristics. Hence, it is meant to capture a potential effect of overall socioeconomic proximity between cook and evaluator. One may, however, question a general effect of dissimilarity and argue that it is rather dissimilarity with respect to specific characteristics that matters. Moreover, similarity in these relevant characteristics may not have a homogeneous effect on the dependent variable. Dissimilarity with respect to age, for example, might result in less favorable evaluations, while heterogeneity with respect to gender might result in more positive evaluations. In order to address this argument, we also estimated specifications that separately include squared deviations in specific socioeconomic characteristics as explanatory variables. As most characteristics enter the model as indicators, these variables, with the exception of age, are dummies indicating a difference with respect to the considered characteristic. Because of (near) collinearity, the data do not allow for estimating a model with the full set of squared deviations, and we focus on a subset of characteristics (gender, immigration status, age, hair color); see Table 6.1.
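The characteristic-specific squared deviations can be sketched as follows. The cook and evaluator records, the dictionary keys, and the function name are hypothetical. The division of the squared age difference by 1000 is an assumption: it is consistent with the range reported for “age (squared diff.)” in Table 6.1 but is not stated in the text.

```python
# Hypothetical cook and evaluator; the keys and values are illustrative only.
cook = {"female": 1, "immigrant": 0, "blond": 0, "age": 29}
evaluator = {"female": 0, "immigrant": 0, "blond": 1, "age": 40}

# Assumption: the squared age difference is divided by 1000. This matches the
# magnitude of "age (squared diff.)" in Table 6.1 but is not stated explicitly.
AGE_SCALE = 1000

def squared_differences(a, b):
    """Squared deviation per characteristic for one cook-evaluator pair."""
    result = {}
    for key in ("female", "immigrant", "blond"):
        # For 0/1 indicators the squared difference is itself a 0/1 dummy
        # that equals one exactly when the two contestants differ.
        result[key] = (a[key] - b[key]) ** 2
    result["age"] = (a["age"] - b["age"]) ** 2 / AGE_SCALE
    return result

print(squared_differences(cook, evaluator))
# {'female': 1, 'immigrant': 0, 'blond': 1, 'age': 0.121}
```

Unlike the overall Mahalanobis measure, each of these terms enters the model with its own coefficient, so the effect of a difference in age may have a different sign from the effect of a difference in gender.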

Table 6.1: Descriptive Statistics

Variable                                         Mean     S.D.   Median      Min      Max

Points                                          7.616    1.363        8        1       10
cooking order
  second                                        0.203    0.402        0        0        1
  third                                         0.202    0.401        0        0        1
  fourth                                        0.200    0.400        0        0        1
  fifth                                         0.193    0.395        0        0        1
evaluator already cooked                        0.500    0.500        0        0        1
number of ingredients                          52.633   16.945       51       16      134
number of ingredients²/100                     30.573   21.594    26.01     2.56   179.56
level                                           1.875    0.346        2        1        3
price                                           1.860    0.362        2        1        3
av. evaluation level (previous 24 weeks)        7.597    0.272    7.552    7.073    8.123
av. share viewers (previous 24 weeks)           9.470    1.564    8.332    7.846   12.234
population (city of venue)                      0.989    1.099    0.580    0.008    8.100
foreign (city of venue)                         0.035    0.185        0        0        1
time                                            1.110    0.491    1.206    0.273    1.908
time²/1000                                      1.474    1.094    1.454    0.075    3.640
female                                          0.534    0.499        1        0        1
immigrant                                       0.074    0.263        0        0        1
age                                            38.284   11.177       38       18       71
student                                         0.070    0.255        0        0        1
civil servant                                   0.032    0.175        0        0        1
artist                                          0.071    0.256        0        0        1
entrepreneur                                    0.210    0.408        0        0        1
pensioner                                       0.006    0.077        0        0        1
employee                                        0.533    0.499        1        0        1
academic                                        0.089    0.285        0        0        1
trainee                                         0.009    0.096        0        0        1
pupil                                           0.009    0.092        0        0        1
blond                                           0.377    0.485        0        0        1
dissimilarity
  overall dissimilarity (Mahalanobis dist.)     0.344    0.162    0.337        0        1
  female (squared diff.)                        0.593    0.491        1        0        1
  immigrant (squared diff.)                     0.133    0.340        0        0        1
  age (squared diff.)                           0.262    0.341    0.121        0    2.500
  blond (squared diff.)                         0.471    0.499        0        0        1
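The rescaled quadratic term for the number of ingredients can be checked against the values in Table 6.1 with a short calculation. The sketch below uses only figures reported in the table. The final step relies on the identity that the mean of a squared variable equals its variance plus its squared mean, which holds only approximately here because the reported figures are rounded and the standard deviation may use an n - 1 denominator.

```python
# Values reported in Table 6.1 for "number of ingredients".
ingredients = {"min": 16, "median": 51, "max": 134}
mean, sd = 52.633, 16.945

# Square first, then divide by 100, as described for the rescaled term.
rescaled = {name: value ** 2 / 100 for name, value in ingredients.items()}
print(rescaled)  # {'min': 2.56, 'median': 26.01, 'max': 179.56}

# Mean of the squared variable implied by the reported mean and S.D.
implied_mean_of_square = (sd ** 2 + mean ** 2) / 100
print(round(implied_mean_of_square, 2))  # 30.57
```

The results agree with the minimum, median, maximum and mean reported for the squared term. Applying the same identity to “time” gives 0.491² + 1.110² ≈ 1.473, close to the mean of 1.474 reported for the squared time term, which suggests that the table reports the square of the already rescaled time variable.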

Table 6.2: Distribution of dependent variable (Points)

value                 1      2      3      4      5      6      7      8      9     10

abs. frequency        1      2     17     53    152    450   1031   1073    643    313

cum. percentage    0.03   0.08   0.54   1.95   6.02  18.07  45.68  74.40  91.62    100
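The summary figures quoted for “Points” follow directly from the absolute frequencies in Table 6.2. The sketch below recomputes the sample size, the cumulative percentages, the mode and the median from those frequencies; the variable names are illustrative.

```python
from itertools import accumulate

# Absolute frequencies of "Points" (values 1 to 10) from Table 6.2.
frequencies = [1, 2, 17, 53, 152, 450, 1031, 1073, 643, 313]
values = range(1, 11)

total = sum(frequencies)
cumulative = list(accumulate(frequencies))
cum_percentage = [round(100 * c / total, 2) for c in cumulative]

# Mode: the value with the highest frequency.
mode = max(zip(frequencies, values))[1]
# Median: the first value whose cumulative count reaches half the sample.
median = next(v for v, c in zip(values, cumulative) if 2 * c >= total)

print(total)           # 3735
print(cum_percentage)
print(mode, median)    # 8 8
```

The total of 3735 matches the size of the estimation sample, and both the mode and the median equal eight. Only about 6% of all evaluations are five points or lower, which is the source of the left skew.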

In document Essays on Technology Transfer, Energy Investment under Uncertainty, and Pro-Social Behavior. (Page 99-102)