Linear regression analysis with a single independent variable may be performed with a graphical display and use of the trendline and R2 features discussed in Section 3.7 and Section 3.8.
Excel may also be used to perform a least-squares regression calculation for a linear function with multiple independent variables. Up to 16 variables may be accommodated.
Consider the variable y expressed as the linear function
y = m1x1 + m2x2 + … + mnxn + b (6.21) which becomes the familiar linear equation for a straight line
y = mx + b (6.22)
when only a single independent variable is involved. Excel executes the least-squares linear regression calculation through use of the worksheet function LINEST, which requires the following syntax:
LINEST(y-values, xn-values, constant(true/false), statistics(true/false)) (6.23) where
n is the number of independent variables in Equation 6.21
y-values is a known array of numerical y-values that may equal or exceed n + 1
xn-values is a known array of numerical values for the independent variables that may equal or exceed n + 1 for each variable
constant(true/false) is a logical statement; TRUE is entered if the regression analysis is to include a determination of the constant b in Equation 6.21; FALSE is entered if the value of b is to be taken as zero
statistics(true/false) is a logical statement; TRUE returns a calculation of certain statisti-cal functions as described in the following text; entry of FALSE returns no statistical calculations
nation Ctrl+Shift+Enter pressed. The format of the results array is:
Thus, an array of 5 rows and n + 1 columns should be selected for the presentation of the results. No other information may appear in this block of cells.
The m and b values displayed are used in Equation 6.21, R2 is the coefficient of deter-mination described in Section 3.8, the se terms are the standard errors for the various quantities, ssreg is the regression sum of squares, ssresidual is the residual sum of squares, df is the number of degrees of freedom, and F is a statistical function. The reader should consult Help/Index under the LINEST function and “standard errors” for a description of the standard error and statistical functions. Our primary concern here is for the proce-dure to determine the mn and b constants.
Example 6.12: Linear Regression for Three Variables
Numerical values for y as a function of the three variables x1, x2, and x3 are shown in the worksheet of Figure 6.17a, with eight numerical values for each variable. A block of cells for the output results of the LINEST function is selected as A12:D16 in Figure 6.17b.
The LINEST function is then typed in cell A12 as shown, in accordance with the syntax of Equation 6.23. The array containing the numerical y-values is D2:D9, and the array containing all of the numerical xn values is A2:C9. Entry of the first TRUE calls for a determination of the constant b, whereas entry of the second TRUE calls for display of the statistical results. Ctrl+Shift+Enter is then pressed to execute the function.
The results are shown in Figure 6.17c along with the original numerical data. In accor-dance with the output format described in Section 6.8 we have:
m1 = 3.247873 m2 = 2.041483 m3 = 1.241881 b = −3.86409 R2 = 0.991044 and the final regression formula for y is
y = 3.247873x1 + 2.041483x2 + 1.241881x3 – 3.86409 (a)
Row
1 mn mn-1 º m1 b
2 sen sen-1 º se1 seb
3 R2 sey
4 F df
5 ssreg ssresidual
Other Operations 147
Note that the values of the mn appear in reverse order to the columns for the numerical values of the xn.
It is interesting to compare Equation (a) with the original numerical data. For the x-values in row 2 we may calculate:
y = (3.247873)(1) + (2.041483)(3.2) + (1.241881)(3) – 3.86409
= 9.642172 as compared to a data value of 9.5 in cell D2 For the x-values in row 9 we obtain:
y = (3.247873)(8) + (2.041483)(9) + (1.241881)(10) – 3.86409
= 52.911 as compared to a data value of 53 in cell D9 The value of R2 = 0.991 indicates a fairly good regression result.
Example 6.13: Modeling Performance of an Air-Conditioning Unit
The cooling performance of a commercial air-conditioning unit depends on the indoor and outdoor temperatures. The appropriate indoor temperature that governs the cooling process is the wet bulb temperature, which we designate as Tew (°F). The outdoor
FIGURE 6.17
temperature that governs is the dry bulb temperature that enters the conditioning unit, which we designate as Tc(°F). The cooling performance of the unit is stated in units of thousands of Btu/h and is designated as Qew.
The left portion of Figure 6.18a gives the performance data for an actual air-conditioning unit as specified by the manufacturer. These data are based on actual measurements of the unit’s cooling capacity and fitted to standard temperature ranges agreed to by man-ufacturers. Although these data are useful as they stand for unit-sizing activities, an analytical expression of the data may be desired for modeling studies in conjunction with performance of other devices associated with a particular application, such as pumps and fans. The LINEST worksheet function is a suitable way to obtain such a relation.
The dependent variable or “y-function” is Qew, whereas the independent variables are Tew and Tc. A group of cells (A21:C25) is activated as shown in the right inset of Figure 6.18a, and the LINEST function is entered in cell A21. The array for the y-variable is C4:C19, whereas the array containing the x-variables is A4:B19. TRUE is entered to obtain a fit with a constant value b, with TRUE also entered to obtain values of the statistical parameters. Ctrl+Shift+Enter is pressed, and the regression parameters are listed in cells A21:C21 with the values
mTew = 0.526
Other Operations 149
b = 34.723
The resulting linear regression formula to be used for calculating Qew is therefore Qew= 0.526Tew – 0.228Tc + 34.723 (6.24)
with R2 = 0.943
Values calculated from this formula may be compared with the original data by calculating a two-variable table using the DATA/TABLE command described in Section 2.16. The worksheet for this calculation is shown in Figure 6.18b. An input cell of E1 is chosen for Tc, whereas cell E2 is chosen for Tew. The formula corresponding to Equation 6.24 is entered in cell E5 as shown in the insert at the bottom of the figure. Values of Tc
are entered in row 5 to the right of the formula, whereas values of Tew are entered in
FIGURE 6.18 (continued)
Tc
34.723 85 95 105 115
72 53.215 50.935 48.655 46.375
Tew 67 50.585 48.305 46.025 43.745
62 47.955 45.675 43.395 41.115 57 45.325 43.045 40.765 38.485
Comparison of Linear Regression and Actual Capacity of A/C Unit
35
tabular values for Qew appear as shown at the top of Figure 6.18b. These values may be compared with those in the original data of Figure 6.18a. Although they do not agree exactly, the values are probably close enough for modeling purposes. A graphical com-parison of the original data and the regression calculations is given in Figure 6.18c. The solid lines represent the original data, whereas the dashed lines represent the regression results. We may note that the original data table almost certainly involves curve-fitting to the original experimental data.