Subsample Regression Analysis - Internal Validity

7.3 Internal Validity

7.3.3 Subsample Regression Analysis

The subsample regression analysis was performed to check for collinearity or time-dependence in the observations. If collinearities exist, they should produce large swings in the β weights of the collinear inputs. Since the simulation data sets are large (millions of observations), it is possible to run regressions on multiple random subsamples of observations to check for such instability. To check for time-dependence, a series of regressions was performed on evenly sized time intervals of the full simulation data sets. The data for each time slice was drawn from the same time range across many runs. By subsampling over time, this analysis examined whether the β weights showed a trend over the course of the simulation.

Detecting Collinearities: Random Subsample Regression Analysis

The random subsample analysis used the same form of regression shown previously in Equation 7.7. It also used the same data sets as Section 7.3.2: the Hamariyah Randomized Condition attention data and the Stanford Hypothesis Condition attention data. However, instead of computing a single regression on each full data set, this analysis computed a number of regressions on subsamples of observations from each full data set. This was intended to expose unstable β terms in the regression that might result from collinearity.

Each smaller regression consisted of 10,000 observations. The subsamples were generated independently, each sampled from the full data set under that condition, which consists of observations from numerous different runs and observations within those runs. As such, each of these random subsets is a limited but representative sample of the full set of observations ($\vec{O}$) used to calculate the regression coefficients shown in Table 7.12.

$$\vec{O}_{\mathrm{Sample}(k)} = \mathrm{choose}(\vec{O}, k) \tag{7.9}$$

$$P\left(\vec{O}_{\mathrm{Sample}(k)} = x\right) = \frac{1}{\binom{N}{k}} \tag{7.10}$$

Each subsample was obtained using a Choose k algorithm, where each subsample consists of k observations sampled from the full set. Within a single subsample, the observations are sampled without replacement (a subsample will not contain any duplicates). Across subsamples, there is no restriction on re-using elements: every subsample is generated starting from the full set of observations. Assume, then, that each subsample ($\vec{O}_{\mathrm{Sample}(k)}$) consists of k elements from the full set ($\vec{O}$), which consists of N elements. The function that selects a subset can be stated as in Equation 7.9, and the probability of any given subsample is given in Equation 7.10.
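The sampling scheme described above can be sketched as follows. This is a minimal illustration rather than the original implementation: the helper name `choose_subsample`, the seed, and the choice of N = 2,500,000 (the lower end of the data set sizes reported in this section) are assumptions made for the example, and observations are represented only by their row indices.

```python
import numpy as np


def choose_subsample(n_total, k, rng):
    """Return k distinct row indices drawn uniformly from range(n_total).

    Hypothetical helper: sampling is without replacement inside one subsample.
    """
    return rng.choice(n_total, size=k, replace=False)


rng = np.random.default_rng(seed=0)  # arbitrary seed, for repeatability only
N, k, n_subsamples = 2_500_000, 10_000, 100

# Every subsample restarts from the full index range, so two different
# subsamples are allowed to share observations.
subsamples = [choose_subsample(N, k, rng) for _ in range(n_subsamples)]

# Expected number of observations shared by two independent subsamples.
expected_overlap = k * k / N
observed_overlap = np.intersect1d(subsamples[0], subsamples[1]).size
```

Under these assumed sizes, any two subsamples are expected to share about k²/N = 40 of their 10,000 observations, so each regression is fit on almost entirely distinct data.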

Since these subsamples were random and independent, the same observation might be included in more than one subsample. However, such cases are uncommon because the subsamples (10,000 observations) are small compared to the full data sets (2.5 to 5 million observations). Since these less powerful regressions should still have enough power to be representative, high instability in the β values implies collinearity issues.
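The following sketch shows why unstable coefficients across subsamples point to collinearity. It is an illustration under stated assumptions, not the analysis reported here: the data are synthetic, the inputs `x1`, `x2`, `x3` are hypothetical, the sizes are smaller than the document's, and ordinary least squares is swapped in for the regression of Equation 7.7 purely for brevity.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # arbitrary seed
n_full, k, n_subsamples = 200_000, 2_000, 100  # assumed, smaller than the real sizes

# Hypothetical inputs: x1 and x2 are nearly collinear, x3 is independent.
x1 = rng.normal(size=n_full)
x2 = x1 + 0.05 * rng.normal(size=n_full)
x3 = rng.normal(size=n_full)
X = np.column_stack([np.ones(n_full), x1, x2, x3])
y = 0.5 * x1 + 0.5 * x2 + 0.5 * x3 + rng.normal(size=n_full)

# Fit one least-squares regression per random subsample.
betas = np.empty((n_subsamples, X.shape[1]))
for i in range(n_subsamples):
    rows = rng.choice(n_full, size=k, replace=False)
    betas[i], *_ = np.linalg.lstsq(X[rows], y[rows], rcond=None)

# Columns: intercept, x1, x2, x3.
beta_mean = betas.mean(axis=0)
beta_std = betas.std(axis=0, ddof=1)
```

In this toy setting the standard deviations for `x1` and `x2` come out far larger than the one for `x3`, even though all three inputs have the same true weight. That is the same signature the subsample analysis looks for.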

Regressions were applied to 100 subsamples, where each subsample consisted of 10,000 observations. Table 7.13 displays the mean and standard deviation of the regression β values for each of the attention salience inputs, for subsamples of the Stanford Hypothesis and Hamariyah Randomized conditions (the same conditions used for the full simulation regressions). Looking at these regressions, it is evident that significant instability exists for all coefficients and major instability exists for some coefficients.

Table 7.13: Subsample Regression β Weights for Attention

                    Stanford Hypothesis       Hamariyah Randomized
Salience Input      Average β    StDev β      Average β    StDev β
Authority              0.26        0.13          0.20        0.11
Conformity             0.09        0.13          0.88        0.06
InGroup                0.41        0.14          0.08        0.09
Motivation            -0.23        0.27         -3.68        0.28
Novelty               -0.01        0.29         -0.30        0.11
Reference Group       -0.02        0.74          0.54        0.31
Selection              0.83        0.13          0.42        0.42
Similarity             0.63        0.26          0.42        0.08
Transferability        0.15        0.13         -0.43        0.05
Valence                0.79        0.16          0.56        0.08

Most β weights across the two conditions have standard deviations greater than 0.1 (all ten in the Stanford Hypothesis condition and five of ten in the Hamariyah Randomized condition). This indicates that collinearity is a problem for the regression, leading to unstable β weights. Additionally, the regression algorithm itself seems to tend toward assigning more extreme β weights rather than distributing weights more evenly. This is a documented issue with typical regression approaches in the presence of multicollinearity (Lipovetsky & Conklin, 2001). As such, even the seemingly stable weights may be improperly biased. Given that collinearity appears to be an issue, the relationships between the factors will be examined in Section 7.3.4.

Detecting Time Trends: Time-Interval Regression Analysis

To detect issues resulting from time-dependent interactions, the full data sets were also split into subsamples based on their simulation time step values. Unlike the prior analysis, these subsamples are deterministic and non-overlapping: no observation from one subsample can be present in another subsample. This analysis also used the same regression shown previously in Equation 7.7, as well as the data sets used earlier in Section 7.3.2 (Hamariyah Randomized Condition and Stanford Hypothesis Condition). A Mann-Kendall trend test was applied to the β coefficients for each salience input, examining the trend of each coefficient across different periods of each simulation.

The subsamples for the time-interval regression were created by splitting the full set of observations according to their simulation time step value (listed as "Step" in Table 7.4). This value determines when the action occurred during the simulation. Multiple observations exist for each time step, because there are multiple runs and the same time point occurs in each of them. Additionally, many agents can observe the same action and determine whether they pay attention to it. This allows even a small slice of simulation time to have many associated observations.

The observations for each simulation condition were split into subsets covering time intervals of approximately equal length. Each of these subsets consisted of data from multiple simulation runs that started from the same experimental condition. Let $\vec{O}_r[t]$ represent the observations for run r that occurred at Step value t. If the time interval length is L and there were Z runs in that condition, the data are split into subsamples as shown in Table 7.14.

Table 7.14: Time-Interval Subsample Approach

Subsample      Step (t)   Run 1                Run 2                ...   Run Z
Subsample 1    1          $\vec{O}_1[1]$       $\vec{O}_2[1]$       ...   $\vec{O}_Z[1]$
               ...        ...                  ...                  ...   ...
               L          $\vec{O}_1[L]$       $\vec{O}_2[L]$       ...   $\vec{O}_Z[L]$
Subsample 2    L+1        $\vec{O}_1[L+1]$     $\vec{O}_2[L+1]$     ...   $\vec{O}_Z[L+1]$
               ...        ...                  ...                  ...   ...
               2L         $\vec{O}_1[2L]$      $\vec{O}_2[2L]$      ...   $\vec{O}_Z[2L]$
...            ...        ...                  ...                  ...   ...
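The split in Table 7.14 amounts to integer division of the Step value by the interval length. The sketch below illustrates this under assumptions: the function name `interval_index` and the six example observations are hypothetical, and Step is taken to start at 1, as in the table.

```python
import numpy as np


def interval_index(step, interval_length):
    """Map 1-based Step values to 0-based subsample indices.

    Steps 1..L go to subsample 0, steps L+1..2L to subsample 1, and so on.
    """
    return (np.asarray(step) - 1) // interval_length


# Hypothetical observations drawn from three runs.
L = 72
runs = np.array([1, 1, 2, 2, 3, 3])
steps = np.array([1, 72, 73, 144, 145, 10])

idx = interval_index(steps, L)

# Row numbers grouped by subsample; the run id plays no part in the split,
# so each subsample pools observations from every run.
subsample_rows = {int(j): np.flatnonzero(idx == j) for j in np.unique(idx)}
```

Because the mapping depends only on Step, the resulting subsamples are deterministic and cannot overlap.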

A logistic regression, in the same form as shown in Equation 7.3, was applied to each of these subsets to estimate the β coefficients for each input to salience. For each input to salience, the associated β coefficients were treated as a time series ordered by the time period of the subset. For example, the Hamariyah regressions assigned InGroup a β value of 0.32 for the step 1 to L subset, a β of 0.26 for the step L+1 to 2L subset, and so on. If the distribution of observations is time-invariant, there should be no significant trends in the β coefficients for any of the inputs to attention. To test for this, a Mann-Kendall trend test was applied to each β time series to determine the association (trend direction) and the p value under the null hypothesis of no trend.
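For reference, a Mann-Kendall test on a single β time series can be sketched as below. This is the textbook form with a normal approximation and no correction for tied values, not necessarily the implementation used for this analysis. The function name `mann_kendall` and the example series are hypothetical.

```python
import math


def mann_kendall(series):
    """Return (tau, two-sided p) for a time-ordered series without ties."""
    n = len(series)
    s = 0
    # S counts later-minus-earlier pairs that increase, minus those that decrease.
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    tau = s / (n * (n - 1) / 2)
    var_s = n * (n - 1) * (2 * n + 5) / 18  # variance of S when there are no ties
    # Continuity-corrected normal approximation.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return tau, p


# Hypothetical beta series that falls steadily across ten intervals.
falling = [0.50 - 0.03 * t for t in range(10)]
tau_falling, p_falling = mann_kendall(falling)
```

A negative tau with a small p value corresponds to a coefficient that loses weight over the simulation. The normal approximation is only reasonable when the series is not too short.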

For the Hamariyah Randomized data set, the data was split into slices 72 steps in length (L = 72). This length was chosen because it was the number of steps required for each agent to take one action. This created 48 non-overlapping subsets, each with approximately 100,000 observations. Table 7.15 displays the Mann-Kendall trend analysis of the regression β values calculated for each time interval. In this table, the Tau coefficient gives the direction and strength of the trend, while the p value is the probability of seeing a trend at least this strong under the null hypothesis (no trend). From this table, it appears that potentially significant trends occur over time. Motivation, novelty, and valence appear to experience a negative trend in their β values over time, making them less indicative of attention. All the other factors appear to have increasing coefficients, except for InGroup, which appears to have a stable influence. This indicates that the collinearities between inputs experience some time-dependent trends. The next section, which explores correlations between factors, examines why some of these trends might occur.

Table 7.15: Interval Regression β Weights for Attention (Hamariyah Randomized)

Salience Input      Tau Coefficient    p
Authority                0.17          0.09
Conformity               0.19          0.06
InGroup                  0.10          0.34
Motivation              -0.56          2·10⁻⁸
Novelty                 -0.17          0.10
Reference Group          0.17          0.08
Selection                0.27          0.01
Similarity               0.17          0.10
Transferability          0.20          0.04
Valence                 -0.28          0.01

The Stanford Hypothesis condition data was also analyzed using a similar technique. Each time slice consisted of 12 steps, creating 58 subsets that each contained approximately 50,000 observations. However, due to the changing guard shifts, it is harder to interpret the β coefficients. Since a change in guard shifts may change some of the collinearities, there will be a degree of periodicity in the β coefficients. To accommodate this, a Seasonal-Kendall trend test was applied. This test works similarly to the Mann-Kendall test, except that it preprocesses each season by finding the median and then looks for trends across the seasons. For this analysis, the guard shifts were treated as "seasons" of the day, so that a 24-hour day consisted of 3 seasons (guard shifts). While this technique loses some temporal information, it is necessary to accommodate the periodic nature of guard shifts. The trend test implementation used was developed by the US Geological Survey (USGS), and its details are described in Helsel, Mueller, Slack, and Geological Survey (US) (2006).
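The idea of testing for a trend while ignoring differences between guard shifts can be sketched with the textbook seasonal Kendall statistic, which sums the Mann-Kendall S statistic computed separately within each season. This is an assumption-laden illustration: it is not the USGS program, it omits the median preprocessing and any tie or serial-correlation corrections, and the function names and example data are hypothetical.

```python
import math


def kendall_s(values):
    """Mann-Kendall S statistic for one time-ordered series."""
    s = 0
    for i in range(len(values) - 1):
        for j in range(i + 1, len(values)):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)
    return s


def seasonal_kendall(values, seasons):
    """Return (tau, two-sided p), comparing values only within the same season."""
    by_season = {}
    for value, season in zip(values, seasons):
        by_season.setdefault(season, []).append(value)
    s_total, var_total, n_pairs = 0, 0.0, 0
    for vals in by_season.values():
        n = len(vals)
        s_total += kendall_s(vals)
        var_total += n * (n - 1) * (2 * n + 5) / 18  # no-ties variance per season
        n_pairs += n * (n - 1) // 2
    tau = s_total / n_pairs
    if s_total > 0:
        z = (s_total - 1) / math.sqrt(var_total)
    elif s_total < 0:
        z = (s_total + 1) / math.sqrt(var_total)
    else:
        z = 0.0
    return tau, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical betas: three shifts per day with large shift offsets and a
# small upward drift from day to day.
offsets = {0: 0.0, 1: 5.0, 2: -5.0}
seasons = [shift for day in range(6) for shift in range(3)]
values = [offsets[shift] + 0.1 * day for day in range(6) for shift in range(3)]
tau_shift, p_shift = seasonal_kendall(values, seasons)
```

In this toy series the shift-to-shift swings dwarf the daily drift, yet the within-season comparison still recovers a consistently increasing trend.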

Table 7.16 displays the trends in the β coefficients for the Stanford Hypothesis condition attention components. The Stanford Prison experiment appeared to have less clear time-dependent trends, possibly due to the periodicity of the guard shifts. However, it still showed a few strong trends in the β values over time. Valence and reference group influence appeared to have less power for attention over time, while similarity appeared to increase its β values significantly. Overall, this data is harder to draw clear conclusions from, but it indicates that some time-dependent changes in collinearity occurred in this simulation as well.

Table 7.16: Interval Regression β Weights for Attention (Stanford Hypothesis)

Salience Input      Tau Coefficient    p
Authority               -0.13          0.67
Conformity              -0.07          0.89
InGroup                  0.07          0.89
Motivation              -0.20          0.48
Novelty                 -0.20          0.48
Reference Group         -0.53          0.03
Selection                0.40          0.12
Similarity               0.87          0.01
Transferability          0.27          0.32
Valence                 -0.80          0.01

Looking at the Tau correlation coefficients from both experiments, a few trends seem to emerge. Firstly, the InGroup β values appear to be fairly consistent with respect to time, neither increasing nor decreasing in importance. This indicates that any collinearity involving InGroup does not have a time component. The motivation, novelty, and valence terms had negative correlation coefficients in both conditions, especially in the Hamariyah condition, which was far less noisy. This indicates that, for some reason, other factors may increasingly overshadow these coefficients over time (i.e., other, collinear inputs to attention get higher β weights while theirs decrease). The factors that increase appear to vary between the simulations, indicating that negative feedback may occur for certain factors. Negative feedback could occur due to the dynamics between factors. It could also occur due to feedback between a factor and attention. The correlation analysis in the next section will attempt to examine what relationships might exist that cause these trends in the β values.
