Identifying Potential Predictors - Linear Regression Using R: An Introduction to Data Modeling

Normally. multidimensional scaling is used to provide a visual representation of a complex set of relationship that can be scanned at a glance. Since maps on paper are

two-dimensional objects; this translates technically to finding an optimal configuration of points in 2-dimensional space. However, the best possible configuration in two dimensions may be a very poor, highly distorted, representation of the data. If so, this will be reflected in a high tress value. When this happened, there are two alternatives, that is, the multidimensional scaling is abandoned as a method of representing data or the number of dimensions is increased. But then, there are two difficulties with inc~easing the number of dimensions. The first is that even 3 dimensions are difficult to display on paper and are significantly more difficult to comprehend. Four or more dimensions renders multidimensional scaling useless as a method of making complex data more accessible to the human mind.

The second problem is that, with increasing dimensions, the number of parameters must be increased to obtain a decreasing improvement in stress. On the other hand, there are some applications of multidimensional scaling for which high dimensionality is not a problem. For instance, multidimensional scaling can be viewed as a mathematical operation that converts an item-by-item matrix into an item-by-variable matrix. Suppose there is a person-by-person matrix of similarities in attitudes. But would like to explain the pattern of similarities in terms of simple personal characteristics such as age, sex, income and education. The problem then is that these two kinds of data are not conformable.

However, if the data is runned through multidimensional scaling (using very high dimensionality in order to achieve perfect stress), a person-by-dimension matrix can

be created which is similar to the person-by-demographic matrix that we arc trying to compare it to. (Young and Harmer. 199-1).

2.13 Shepa rd Diagram

The Shepard diagram is a scatter plot of input proximities (both ^X;j and

f(-'",;)

against

output distances for every pair of items scaled. Normally. the X-axis corresponds to the input proximities 'and the Y-axis corresponds to both the MDS distances £I,;and the transformed (fitted) input proximities f(X^ii). If the input proximities are similarities.

the points should form a loose line from top left to bottom right (negative slope). but if the proximitics arc dissimilarities. the data would form a line from bottom left to top right (positive slope), (Shepard. 1962).

2.14 Interpretation

There are two important things to realize about multidimensional scaling map, The first is that the axes are in themselves meaningless and the second is that the orientation or the picture is arbitrary, Thus multidimensional scaling representation

or

distances between objects need not be oriented such that north is up. south is down.

cast is right and west is left. In fact. north might be diagonally down to the left and east diagonally up to the left. All that matters in multidimensional scaling is the position of points.

When looking at a map that has much strain, it must be kept in mind that the distances among items are, distorted, representation of the relationship given by the data at

hand. In general, however, larger distances can be relied as being accurate. This is because the strain function accentuates discrepancies in the larger distances, and the multidimensional scaling program therefore tries harder to get this rights.

There are two things to look for in interpreting multidimensional scaling picture:

cluster and dimensions. Clusters are groups of items that are closer to each other than other items. For instance in multidimensional scaling map of perceived similarities among animals, it is typical to find that the domestic ariimals such as cow, horse and pig are all very near to each other, forming a cluster. Similarly, the zoo animals like lion, tiger, antelope, monkey, elephant and giraffe form a cluster. When really tight, highly separated clusters occur in perceptual data, it may suggest that each cluster is a domain or subdomain of the data. It is especially important to realize that any relationship observed within such a cluster, such as items 'a' being slightly closer to item 'b' than 'c' should not be trusted because the exact placement of items within a tight cluster has little, effect on the overall stress and so may be arbitrary, (Borg and Groeman., 1997).

Dimensions are item attributes that seem to order items in map along a continuum. For instance maps on papers are normally designed on 2-diemsnional (i.e. X - axis and Y-axis) which is a mathematical dimension. Here the substantive dimensions or attributes need not correspond in numbers or direction to mathematical dimensions (axes) that define the vector space (multidimensional scaling map). For instance, the

larger than the number of mathematical dimensions needed to reproduce the observed pattern.

This ^IS because the mathematical dimensions are necessarily orthogonal (perpendicular). In contrast, the human dimensions, while cognitive distinct, may be highly intercorrelated and therefore contain some redundant information.

One thing to keep in mind in looking for dimension is that one person's respondents may not have the same view as that person may do. For instance, for one thing, the respondent may be reacting to attributes, the person have not thought of. Even if the person and the respondents are using the same set of attribute, the respondents may assign different scores on each attribute the person do. For instance, one of the attributes might be "attractiveness" then the person's view of what an attractive person, fruit or other item may be different from his respondents, (Kruskal and Wish,

1977)

In document Linear Regression Using R: An Introduction to Data Modeling (Page 38-41)