Connection with count-based measures of the disposition effect

The previous section used a Cox model to estimate the difference in the rate at which gains and losses are sold, i.e. the hazard ratio for the paper gain indicator. The ratio PGR/PLR, introduced in section 1.4 and used throughout section 1.5, has a qualitatively similar interpretation. This section formalises the relationship between the two quantities.

The hazard ratio can also be estimated non-parametrically using a formula proposed in Peto and Peto (1972)

$$\widehat{HR} = \frac{O_G / E_G}{O_L / E_L} \qquad (1.8)$$

where $O_G$ and $O_L$ are the total number of sales observed for a gain and loss respectively. $E_G$ is the number of sales for a gain that would be expected if there was no difference between the rate at which gains and losses are sold, and is defined by

$$E_G = \sum_{i=1}^{k} \frac{m_i}{R_i} N_{Gi}$$

where $i = 1, \dots, k$ are the unique event times, $m_i$ is the number of sales that occurred at time $i$, $N_{Gi}$ is the number of positions trading for a gain at time $i$ and $R_i$ is the total number of positions that were at risk of being sold at time $i$.[18] $E_L$ is defined similarly. The 'observed' and 'expected' quantities, $O$ and $E$, are the same as those from the log-rank statistic for testing the equality of two survival curves. Note that for simplicity it is being assumed that a position must be either trading for a gain or a loss at all times during its holding period.

[18] For a large dataset such as this, there is at least one event on each day of the follow-up period, so sums over the unique event times are effectively sums over the days of the follow-up period.
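As a minimal sketch of how the estimator in (1.8) is assembled from daily counts, the following Python snippet computes the observed and expected sales and the resulting hazard ratio estimate. The three event times and all counts are hypothetical toy values chosen for illustration, not figures from the LDB dataset, and the function name is an assumption of this sketch. It relies on the simplifying assumption stated above, so the risk set at each event time is the sum of the gain and loss positions.

```python
# Each tuple is one unique event time i (hypothetical toy data):
# (N_Gi, N_Li, gains sold at i, losses sold at i)
toy_event_times = [
    (10, 10, 3, 1),
    (8, 12, 2, 2),
    (6, 14, 2, 2),
]


def peto_hazard_ratio(event_times):
    """Return (HR estimate, O_G, E_G, O_L, E_L) following formula (1.8)."""
    o_g = sum(sold_g for _, _, sold_g, _ in event_times)
    o_l = sum(sold_l for _, _, _, sold_l in event_times)
    e_g = 0.0
    e_l = 0.0
    for n_g, n_l, sold_g, sold_l in event_times:
        m_i = sold_g + sold_l   # total sales at time i
        r_i = n_g + n_l         # positions at risk at time i
        e_g += m_i * n_g / r_i  # expected gain sales under equal selling rates
        e_l += m_i * n_l / r_i  # expected loss sales under equal selling rates
    hr = (o_g / e_g) / (o_l / e_l)
    return hr, o_g, e_g, o_l, e_l


hr, o_g, e_g, o_l, e_l = peto_hazard_ratio(toy_event_times)
print(f"O_G={o_g}, E_G={e_g:.2f}, O_L={o_l}, E_L={e_l:.2f}, HR={hr:.3f}")
```

In this toy example the observed and expected totals agree, $O_G + O_L = E_G + E_L$, as they must, since each day's sales $m_i$ are split between the two groups in proportion to their share of the risk set.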

From the definition of the terms in formulas (1.1) and (1.2), PGR and PLR can be re-expressed using the variables from (1.8)

$$PGR = \frac{O_G}{N_G}, \qquad PLR = \frac{O_L}{N_L}, \qquad \text{where } N_G = \sum_{i=1}^{k} N_{Gi}$$

and likewise for $N_L$. Now the Odean ratio from (1.3) can be written as

$$DE = \frac{O_G / N_G}{O_L / N_L} \qquad (1.9)$$

Dividing the hazard ratio estimator in (1.8) by this quantity produces a scaling factor that does not depend on the observed counts $O_G$ and $O_L$

$$\frac{\widehat{HR}}{DE} = \frac{O_G E_L \, O_L N_G}{O_L E_G \, O_G N_L} = \frac{E_L N_G}{E_G N_L} \qquad (1.10)$$

This implies that $\widehat{HR} > DE$ if

$$\frac{E_L N_G}{E_G N_L} > 1 \quad \text{or equivalently} \quad \frac{E_L}{N_L} > \frac{E_G}{N_G}$$

This means that the hazard ratio estimate will be greater in magnitude than the modified Odean score if the ratio of the expected number of sales to the total number of positions in the risk set (summed across the unique event times) is greater for losses than it is for gains.
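The identity in (1.10) can be checked numerically. The sketch below uses exact rational arithmetic on two hypothetical event times (toy counts, not LDB data) to confirm that the hazard ratio estimate equals the DE score multiplied by the scaling factor, and that the direction of the inequality follows the comparison of $E_L/N_L$ with $E_G/N_G$.

```python
from fractions import Fraction

# Hypothetical toy data, one tuple per unique event time i:
# (N_Gi, N_Li, gains sold at i, losses sold at i)
days = [
    (10, 10, 4, 2),
    (5, 15, 1, 1),
]

O_G = sum(sg for _, _, sg, _ in days)
O_L = sum(sl for _, _, _, sl in days)
N_G = sum(ng for ng, _, _, _ in days)
N_L = sum(nl for _, nl, _, _ in days)

# Expected sales: m_i / R_i times the group's share of the risk set.
E_G = sum(Fraction((sg + sl) * ng, ng + nl) for ng, nl, sg, sl in days)
E_L = sum(Fraction((sg + sl) * nl, ng + nl) for ng, nl, sg, sl in days)

hr = (Fraction(O_G) / E_G) / (Fraction(O_L) / E_L)  # formula (1.8)
de = Fraction(O_G, N_G) / Fraction(O_L, N_L)        # formula (1.9)
scale = (E_L * N_G) / (E_G * N_L)                   # formula (1.10)

print(f"HR = {hr}, DE = {de}, scaling factor = {scale}")
print("HR equals DE * scale:", hr == de * scale)
print("E_L/N_L > E_G/N_G:", E_L / N_L > E_G / N_G, "| HR > DE:", hr > de)
```

Here the day with the higher overall selling rate also has the larger share of gain positions, so $E_G/N_G$ exceeds $E_L/N_L$ and the hazard ratio estimate (15/7) falls below the DE score (25/9).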

To demonstrate this relationship, the scaling factor derived in (1.10) can be calculated for the LDB dataset and applied to the DE score produced using the sample of positions collected as described in section 1.7.1. Because the derivation above assumes that a position is in either the paper gain or paper loss category at all times, intervals where a position is in the paper neutral category are recoded as paper losses. Estimating a Cox model produces a hazard ratio for the gain indicator of 1.78. The DE score for this sample is 1.57, and the scaling factor is 1.12. Multiplying the two produces an estimate of the hazard ratio of 1.76. The remaining discrepancy is likely due to the downward bias of $\widehat{HR}$ in large samples with tied event times, as discussed in Bernstein et al. (1981).[19]

[19] Note that $\widehat{HR}$ is equivalent to their O/E estimator since for this dataset all tables are informative, i.e. there are gain and loss positions at risk at all time points.
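The arithmetic behind the reported figures can be reproduced directly. The snippet below uses only the three rounded values quoted in the text, so the result is approximate to the same two decimal places; the variable names are illustrative.

```python
# Values as reported in the text (already rounded to two decimals).
de_score = 1.57        # DE score for the sample
scaling_factor = 1.12  # E_L * N_G / (E_G * N_L), from (1.10)
cox_hr = 1.78          # hazard ratio for the gain indicator from the Cox model

implied_hr = de_score * scaling_factor  # non-parametric HR estimate via (1.10)
gap = cox_hr - implied_hr               # remaining discrepancy with the Cox estimate

print(f"implied HR = {implied_hr:.2f}, gap to Cox HR = {gap:.2f}")
```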

Chapter 2 Using a frailty model to measure the effect of covariates on the disposition effect

2.1 Introduction

An important aim in the DE literature has been to determine which factors either strengthen or weaken the effect, particularly the characteristics of the investors themselves. Starting with Feng and Seasholes (2005), the use of a proportional hazards regression model, introduced in section 1.6, has been a common approach to this problem. In the context of survival analysis, an important property of trading record datasets is the natural grouping of trades at the investor level. Positions held by the same investor are likely to be dependent due to the particular investment style of the investor, and this unobserved heterogeneity needs to be addressed when estimating a regression model. The problem of investor-level dependency is made more important by the extreme imbalance that can be present in datasets of this kind. A small number of investors account for a large proportion of overall trading activity, hence effect estimates will disproportionately reflect the idiosyncrasies of these investors if no effort is made to control for them.

In the existing literature the marginal method is used, which adjusts standard errors after the model has been estimated to account for the investor-level correlation between positions. This chapter will explore the use of frailty models as an alternative. The analysis will again make use of the LDB dataset, which was introduced in section 1.2. In a frailty model, the propensity to sell a position that is unique to each investor is modelled directly, in a way that is analogous to the use of random effects in linear regression. For the dataset used here, the addition of a frailty component significantly improves upon the equivalent marginal model and allows a greater number of statistically significant effects to be estimated. In addition, the frailty model adheres much more closely to the proportional hazards assumption, which makes the parameter estimates more useful as summaries of the effect over time.
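To illustrate why a shared investor-level term induces dependence between positions, the following sketch simulates holding times in which each investor's selling hazard is a common baseline multiplied by an investor-specific frailty. It is a toy simulation under assumptions that are not taken from this chapter: a gamma-distributed frailty with mean 1, a constant baseline hazard of 1, two positions per investor and 5,000 investors. It is not the model estimated later.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def simulate_pairs(n_investors, frailty_variance, rng):
    """Log holding times of two positions per investor sharing one frailty."""
    first, second = [], []
    for _ in range(n_investors):
        if frailty_variance > 0:
            # Assumed gamma frailty: shape 1/theta, scale theta -> mean 1, variance theta.
            z = rng.gammavariate(1.0 / frailty_variance, frailty_variance)
        else:
            z = 1.0  # no unobserved heterogeneity
        # Selling hazard is z times a constant baseline hazard of 1.
        first.append(math.log(rng.expovariate(z)))
        second.append(math.log(rng.expovariate(z)))
    return first, second


rng = random.Random(0)
corr_frailty = pearson(*simulate_pairs(5000, 1.0, rng))
corr_none = pearson(*simulate_pairs(5000, 0.0, rng))
print(f"within-investor correlation with frailty:    {corr_frailty:.2f}")
print(f"within-investor correlation without frailty: {corr_none:.2f}")
```

With the frailty switched on, holding times of positions belonging to the same investor are clearly positively correlated, whereas without it they are approximately uncorrelated. This is the dependence that a marginal model corrects for only in the standard errors, and that a frailty model represents explicitly.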

Results from the frailty model provide some new evidence on experience and learning; the number of trades an investor has made does not have a significant effect on the DE when the investor’s self-assessed experience level is included in the model, and the length of time an investor has held an account does not appear to be a reliable measure of experience in this dataset, as those who opened an account most recently exhibit the weakest DE. Graphical checking of the proportional hazards assumption adds nuance to the interpretation of some variables. For example, the weakening of the DE in December is much larger for positions that have already been held for a long period of time, and differences in the DE between positions in small and large cap-size stocks only start to materialize after they have been held for roughly 100 days.

The remainder of the chapter is organised as follows. Section 2.2 discusses the problem of investor-level correlation in detail, along with both the marginal and frailty approaches to dealing with it. Section 2.3 describes the covariates that will be used for model estimation. Section 2.5 presents results from estimating both a frailty and marginal model, and highlights how they differ. Section 2.6 presents some additional models that check the robustness of the main results. Section 2.7 summarises the advantages that using a frailty model provided for this analysis and compares the results to those in the literature.
