• No results found

Using absolute deviations to compute lines of best fit

N/A
N/A
Protected

Academic year: 2020

Share "Using absolute deviations to compute lines of best fit"

Copied!
5
0
0

Loading.... (view fulltext now)

Full text

(1)

Using Absolute Deviations to Compute

Lines of Best Fit

J. P. H O U C K R. D . H U N T

TH E R E is some renewed interest among agricultural economists and others

in the old technique o f using minimized absolute deviations in computing lines of best fit. H . B . Jones and J. C. Thompson, in a recent article in Agricultural Economics Research, discussed philosophical and practical questions raised when fitted lines using squared and unsquared deviations are compared[5j. However, economic literature (including the Jones-Thompson paper) as well as elementary statistics and econometric literature seems to contain no concise description o f the straight-forward and simple procedure for calculating straight lines o f best fit which minimize the sum of absolute deviations from the empirical observations to the fitted line. In some work at the University of Minnesota, we find that lines fitted by minimizing the sum o f absolute deviations can be useful in certain

circumstances. • Our purpose is to present, without proving, a simple step-by-step method for

calculating such lines o f best fit for the two-variable case. A n illustrative example is appended. Those wishing to investigate the historical development and mathe­ matical basis for these methods are directed to the References section, especially [2] [3] M [7] M- The article by Kaarst is particularly helpful.

(2)

Consider a set o f n observations on Y and X to which a linear function is to be fitted. The fitted function is to have the form

U Y='d+bX>

where a and b are parameter estimates.

Suppose that, instead o f an ordinary least squares line o f best fit, a MSAD line is desired.

T w o versions o f this line can be computed [6]. The first is the M S A D line constrained to pass through a pre-selected point such as the means, the medians, or the origin. The second is the unconstrained M S A D line. Unlike the ordinary least squares line, the unconstrained M S A D straight line does not, in general, pass through the point o f means.

The Constrained MSAD Line

1. Select the reference point (Y0, X0) through which the M S A D line is to pass. For each o f the n observations, compute

y i = ( Y - Y 0 )

i ' = 1,2,3, . . . n

• • ; . ' . - x o ) .

2. Calculate the n ratios, y,-/*;. ' Then rank these ratios in ascending algebraic order, beginning with largest negative or the smallest positive number.

3. Sum the absolute values o f x{; Zn\xi\ •

4. To the value—En \ xt | add the absolute value 2 | x{ | from the ratio having the lowest rank. Then add the absolute value 2 | x{ | from the ratio with ' t h e next lowest rank. Continue until'the algebraic sum changes sign from

. negative to positive.. '; V ' ' 1

5. Note the actual value o f the ratio (yk/xk) at which the sign change occurs. This rations the'slope,&,'of the constrained M S A D line. This line w i l l pass J. through' both the pre-selected point 'and the point at which the ratio

„(xA/yA) is computed. Having calculated b then

(3)

When S is plotted on the vertical axis and b is plotted on the horizontal axis the result is an open polygon, convex downward [6]. The minimum point on this polygon is located directly above the point b = xkJyk, where the slope -o f S switches fr-om negative t-o p-ositive.1

The Unconstrained MSAD Line

Finding the unconstrained MSAD straight line o f best fit involves a simple iterative procedure based on the method described above for the constrained line. The procedure is:

1. Compute a constrained MSAD line, but select one o f the data points as

the initial reference point (fewer iterations are needed i f the selected point is on or near a freehand line o f best fit.)

2. This constrained line w i l l pass through at least one other data point in the

sample. Select this point as the next reference point, and compute another.' M S A D line.

3. This second line w i l l pass through still another data point. Using this point

' as the reference, re-compute.

4. Continue until the fitted line reflects back through a data point already used. This line is then the unconstrained M S A D line o f best fit. By judicious selection o f the initial reference point on or near a freehand trend line, only one or two iterations usually w i l l be needed.

When the constraint that the MSAD straight line must pass through a pre­ selected point is removed, then two parameters, a and b, must be estimated directly. Then, in this case

This is an open polyhedron in three dimensions [6]. The iterative procedure de­ scribed above is then a systematic search for the minimum o f S using various traces o f S, each associated with a given value2 o f a.

Goodness of Fit ' • ' '

Measuring the goodness o f fit o f an MSAD line is not as straight-forward as with the least squares technique. This is because the total sum o f absolute devia­ tions o f Yj about a point, say the mean or median, .cannot be partitioned',un­ ambiguously into that portion accounted for by the fitted line and that portion

1. I f the polygon S has a flat horizontal base rather than-a single m i n i m u m point, then b is indeterminant between the values at the ends o f the base. • • •

(4)

not accounted for by the fitted line, as can be done with squared deviations about the sample mean in least squares. However, we suggest the following coefficient

as a measure of how well an MSAD line fits the sample observations on Y(.

f = i - Z „ \ Y , . - Y , . | / i 7n| Y t - Y \

Where % are values o f Y along the fitted MSAD line associated with the sample

X{ and Y is the median value o f Y.

.The median is selected as the reference point since, for any given set o f num­ bers, the sum of the absolute deviations is smaller when measured from the median than from any other number [9]. The coefficient, / , can range between zero and + i-o. I f there is no systematic association between X and Y in the data, then the minimizing o f absolute deviations w i l l yield Ys= Y a n d / w i l l be zero. On the other hand, i f the fit is perfect and'Y) = Y{, t h e n / w i l l be + i-o. Intermediate values o f / w i l l then indicate how well the MSAD line fits the data relative to a scale o f zero to + i-o. As a criterion forjudging goodness o f fit, the coefficient / probably should not be compared with the least squares r2 for the same set o f data. Its use should be confined to comparisons o f MSAD lines of best fit. For example, the coefficient / can be used to compare constrained vs unconstrained M S A D lines or constrained MSAD lines fitted through various pre-selected points.

An example

Consider the following set o f five observations on X and Y respectively =

(1,1), (2,4), (3,5), (4,6) and (5,7). I f a constrained MSAD straight line is fitted

through the point (1,1), its slope, b, is 1-67 and its Y-axis intercept, a, is — 0-67.

The/coefficient is -67. However, when the unconstrained procedure is used, the slope is + i-oo and the Y-axis intercept is + 2-00. The/coefficient increases to -75. The reader may wish to verify the results o f this illustrative example using the procedures outlined previously.

In conclusion

- Extending these methods beyond two variables is not easy though it has been done mathematically [7] [8]. The multiple variable problem may be reduced to a linear programming exercise [4]. This use is demonstrated by Arrow and Hoffenberg in an interindustry demand study [ i ] , b u t beyond this the use o f this technique in economic research does not appear to be widespread.

r ' . References

' 1. A r r o w , K . J . and Hoffenberg, M . , ^ l Time Series Analysis of Interindustry Demands. Amsterdam: N o r t h H o l l a n d Publishing C o m p a n y , 1959, pp. 55—57.

2. Ashar, V . G . and T . D . Wallace, " A Sampling Study o f M i n i m u m Absolute Deviations Estimators," Operations ^Research, N o . 5, V o l . 1, September-October 1963, pp. 747-758.

(5)

4. Fisher, W . D . , " A N o t e on C u r v e Fitting w i t h M i n i m u m Deviations b y Linear P r o g r a m ­ m i n g , " Journal of American Statistical Association, V o l . 56, 1961, pp. 359-362.

5. Jones, Harold. B . , and Jack C . T h o m p s o n , "Squared versus Unsquared Deviations for Lines o f Best F i t , " Agricultural Economics Research, V o l . 20, N o . 2, A p r i l 1968, pp. 64-69.

6. Kaarst, O t t o J . , "Linear C u r v e Fitting U s i n g Least Deviations," Journal of the American Statistical Association, N o . 281, V o l . 53, M a r c h 1958, pp. 118-132.

7. Rhodes, E . C , "Reducing Observations b y T h e M e t h o d o f M i n i m u m Deviations," Philosophical Magazine, Series 7, V o l . 9, N o . 60, M a y 1930.

8. Singleton, R . R . , " A Method for Minimizing T h e S u m o f Absolute Values o f Deviations," Annals of Mathematical Statistics, V o l . n , 1940, pp. 301-305.

References

Related documents

(2) Setiap Orang yang dengan tanpa hak dan/atau tanpa izin Pencipta atau pemegang Hak Cipta melakukan pelanggaran hak ekonomi Pencipta sebagaimana dimaksud dalam Pasal 9 ayat

Since 2001, Corporate Social Responsibility (CSR) is a policy topic in European Union (EU), therefore business sector contribution to sustainable development (SD) is getting more

The topics included in such classes include web programming; SQL; text/data mining (NLP); social computing; development/deployment for the cloud; journalistic practices in the

Check reference electrode filling solution and refill if necessary Clean or replace filter element 2 Months Replace reagent, diffusion tubing,4.

Some countries and the European Commission have followed “unconventional” approaches to this issue in the case of wholesale broadband access markets: While some disregarded the

Further measures could additionally improve the access to investment means. In this respect, tax credit related measures would mainly benefit the owner-occupied, rented housing

FO B cells enter secondary lymphoid organs (spleen, lymph nodes, mucosal lymphoid tissue) through blood vessels located in the T cell areas and then migrate into the follicle,

Move into iOS development by getting a firm grasp of its fundamentals, including the Xcode IDE, the Cocoa Touch framework, and Swift 2.0—the latest version of Apple’s