Basic Concepts and Equations
Equations 4.13a and 4.13b represent the hydrostatic pressure distribution and apply to open-channel flows unless the water surface curves very sharply in the
2. Formulate the regression model: The standard model is a linear additive model, which has the form

$$\hat{Y} = b_0 + b_1 \cdot X_1 + b_2 \cdot X_2 + \cdots + b_P \cdot X_P, \tag{4.65}$$

where $b_0, \ldots, b_P$ are regression coefficients ($b_0$ is often called the regression constant). However, in hydraulics the most common model is the linear multiplicative model:

$$\hat{Y} = c_0 \cdot X_1^{c_1} \cdot X_2^{c_2} \cdots X_P^{c_P}. \tag{4.66}$$

Although the choice of additive or multiplicative model is up to the scientist, the regression process is identical in both, because equation 4.66 can be put in the form of 4.65 via a logarithmic transform:

$$\widehat{\log Y} = \log c_0 + c_1 \cdot \log X_1 + c_2 \cdot \log X_2 + \cdots + c_P \cdot \log X_P. \tag{4.67}$$

Note that the “hat” notation in equations 4.65 and 4.67 denotes an estimate of the average value of the dependent variable $Y$ or $\log Y$ associated with a particular set of $x_{ji}$ values. This estimate is subject to uncertainty because (1) the model is always imperfect, and (2) the coefficients are derived for a specific set of data.
3. Collect the data: These are N measured values (observations) of the dependent and independent variables, $y_i, x_{1i}, x_{2i}, \ldots, x_{Pi}$, which must be associated in space or time.
4. Determine the values of the coefficients: The mathematics of ordinary regression analysis provide estimates of $b_0, \ldots, b_P$ or $c_0, \ldots, c_P$ that “best fit” the observations in the sense that, for the data used, the coefficients minimize

$$\sum_{i=1}^{N} \left(y_i - \hat{y}_i\right)^2 \quad \text{or} \quad \sum_{i=1}^{N} \left(\log y_i - \widehat{\log y_i}\right)^2, \tag{4.68}$$

where the $y_i$ are the actual measured values of the dependent variable, the $\hat{y}_i$ or $\widehat{\log y_i}$ are the values estimated by the regression equation (equation 4.65 or 4.67), and there are N sets of measured values ($i = 1, 2, \ldots, N$).
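The fitting procedure in steps 2 through 4 can be sketched for the single-predictor case (P = 1) of the multiplicative model. This is a minimal illustration, not the text's own procedure: the data values, the "true" coefficients 2.0 and 0.7, and the function names `fit_line`, `fit_power_law`, and `sum_sq_log_residuals` are all hypothetical. The sketch log-transforms the data, fits a straight line by ordinary least squares, and compares the sum of squared log residuals at the fitted coefficients with the sum at another coefficient pair.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y ~ b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b1 = sxy / sxx
    return y_mean - b1 * x_mean, b1

def fit_power_law(x, y):
    """Fit y ~ c0 * x**c1 by regressing log10(y) on log10(x)."""
    log_x = [math.log10(v) for v in x]
    log_y = [math.log10(v) for v in y]
    log_c0, c1 = fit_line(log_x, log_y)
    return 10.0 ** log_c0, c1

def sum_sq_log_residuals(x, y, c0, c1):
    """Sum of squared differences between observed and estimated log10(y)."""
    return sum(
        (math.log10(yi) - (math.log10(c0) + c1 * math.log10(xi))) ** 2
        for xi, yi in zip(x, y)
    )

# Hypothetical observations: y = 2.0 * x**0.7 with made-up multiplicative scatter.
x_obs = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
scatter = [1.05, 0.97, 1.02, 0.95, 1.04, 0.98]
y_obs = [2.0 * xi ** 0.7 * s for xi, s in zip(x_obs, scatter)]

c0_hat, c1_hat = fit_power_law(x_obs, y_obs)
print(f"c0 = {c0_hat:.3f}, c1 = {c1_hat:.3f}")
print("SSR at fitted coefficients:", sum_sq_log_residuals(x_obs, y_obs, c0_hat, c1_hat))
print("SSR at (2.0, 0.7):         ", sum_sq_log_residuals(x_obs, y_obs, 2.0, 0.7))
```

Because the fit is done on the logarithms, the quantity minimized is the sum of squared log residuals, not the sum of squared residuals of Y itself. For the scattered data, the fitted coefficients therefore give a sum no larger than the generating values 2.0 and 0.7 do.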
From these steps, it is clear that regression equations differ fundamentally from equations based on the laws of physics:
• The P variables included in an empirical equation are determined by the scientist, not by nature.
• The form of an empirical equation is determined by the scientist, not by nature.
• The numerical coefficients and exponents in an empirical equation are determined by the particular set of data analyzed (the N sets of $y$ and $x_j$ values) and, in general, are not universal.
• The relationships resulting from statistical analysis reflect association among variables, but not necessarily causation.
Because of these characteristics, uncertainty is an inherent aspect of regression analysis. There are some additional critical differences between regression equations and those derived from basic principles. One that is often overlooked is that ordinary regression equations are not invertible. To understand this, suppose we analyze a set of data and produce a regression equation
$$\hat{Y} = b_0 + b_1 \cdot X_1. \tag{4.69}$$

If this were a purely mathematical relation, we would consider that $\hat{Y} = Y$, and it would be true that

$$X_1 = -\frac{b_0}{b_1} + \frac{1}{b_1} \cdot Y. \tag{4.70}$$
However, if we use the same data to do an ordinary regression with $X_1$ as the dependent variable and $Y$ as the predictor variable, the constant will not be equal to $(-b_0/b_1)$ and the coefficient will not be equal to $(1/b_1)$.9
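The lack of invertibility can be checked numerically. The sketch below uses a small set of hypothetical paired observations (the values and the helper names `fit_line` and `r_squared` are illustrative assumptions). It regresses Y on X1, inverts that equation algebraically, and compares the result with a direct regression of X1 on Y. For ordinary least squares the product of the two fitted slopes equals the squared correlation coefficient, so the two agree only when the data fall exactly on a line.

```python
def fit_line(x, y):
    """Ordinary least squares for y ~ b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b1 = sxy / sxx
    return y_mean - b1 * x_mean, b1

def r_squared(x, y):
    """Squared correlation coefficient of two equal-length samples."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    return sxy ** 2 / (sxx * syy)

# Hypothetical paired observations with scatter about a rising trend.
x_obs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y_obs = [2.1, 2.9, 4.8, 4.9, 7.2, 6.8, 9.1, 9.0]

b0, b1 = fit_line(x_obs, y_obs)  # Y regressed on X1 (equation 4.69)
a0, a1 = fit_line(y_obs, x_obs)  # X1 regressed on Y

inverted_constant = -b0 / b1     # constant implied by equation 4.70
inverted_slope = 1.0 / b1        # coefficient implied by equation 4.70

print(f"algebraic inverse:    constant = {inverted_constant:.4f}, slope = {inverted_slope:.4f}")
print(f"regression of X1 on Y: constant = {a0:.4f}, slope = {a1:.4f}")
print(f"a1 * b1 = {a1 * b1:.4f}, r^2 = {r_squared(x_obs, y_obs):.4f}")
```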
A final fundamental difference between empirical equations and those derived from basic physics is that, in general, empirical equations are not dimensionally homogeneous. As explained in appendix A, this means that the coefficients estimated by the regression analysis must be changed for use in different measurement systems (e.g., British and SI).
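As an illustration of why coefficients must change with the unit system, consider a hypothetical empirical relation U = c0 · Y^c1 fitted with U in m/s and Y in m. The relation, its coefficient values, and the function names below are assumptions made for this sketch only. Substituting Y[m] = 0.3048 · Y[ft] and U[ft/s] = U[m/s] / 0.3048 shows that the exponent is unchanged but the coefficient becomes c0 · 0.3048^(c1 − 1).

```python
M_PER_FT = 0.3048  # metres per foot (exact by definition)

def velocity_si(depth_m, c0_si, c1):
    """Hypothetical empirical relation U = c0 * Y**c1, with U in m/s and Y in m."""
    return c0_si * depth_m ** c1

def c0_for_british_units(c0_si, c1):
    """Coefficient of the same relation when U is in ft/s and Y is in ft."""
    return c0_si * M_PER_FT ** (c1 - 1.0)

c0_si, c1 = 1.5, 0.6  # made-up values for illustration
c0_br = c0_for_british_units(c0_si, c1)

depth_ft = 5.0
# Route 1: convert the depth to metres, apply the SI equation, convert the result to ft/s.
u_ft_via_si = velocity_si(depth_ft * M_PER_FT, c0_si, c1) / M_PER_FT
# Route 2: apply the converted coefficient directly to the depth in feet.
u_ft_direct = c0_br * depth_ft ** c1

print(f"c0 (SI) = {c0_si}, c0 (British) = {c0_br:.4f}")
print(f"U = {u_ft_via_si:.4f} ft/s via SI, {u_ft_direct:.4f} ft/s directly")
```

Applying the SI coefficient to a depth in feet without this conversion would give a wrong velocity, which is the practical consequence of dimensional inhomogeneity.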
4.8.3.2 Empirical Equations Based on Dimensional Analysis
The use of dimensional analysis to reduce a problem involving a large number of physical variables to one involving a smaller number of dimensionless quantities is described in section 4.8.2. Once the dimensional analysis is completed, the nature of the functional relationships among the dimensionless quantities is explored using
172 FLUVIAL HYDRAULICS
observational data from laboratory experiments or field observations. Regression analysis can be a useful tool in this exploration.
Applying linear regression models to dimensionless quantities, we can write the analogs of equations 4.65 and 4.66, respectively, as
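$$\hat{Y} = b_0 + b_1 \cdot \Pi_1 + b_2 \cdot \Pi_2 + \cdots + b_P \cdot \Pi_P, \tag{4.71}$$

and

$$\hat{Y} = c_0 \cdot \Pi_1^{c_1} \cdot \Pi_2^{c_2} \cdots \Pi_P^{c_P}, \tag{4.72}$$

where one of the pi terms has been selected as the dependent variable and designated $Y$.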
Whichever model we choose, all the quantities are dimensionless, so in addition to simplifying the problem, we avoid having to worry about changing equations for use with different unit systems.
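The unit independence of dimensionless quantities can be verified directly. The sketch below evaluates a dimensionless velocity of the form U/(g·Y·sin θ_S)^{1/2} for one hypothetical flow, expressed first in SI units and then in British units. The numerical values and the function name `dimensionless_velocity` are illustrative assumptions, not data from the text.

```python
import math

M_PER_FT = 0.3048  # metres per foot (exact by definition)

def dimensionless_velocity(velocity, depth, gravity, slope_angle_rad):
    """U / (g * Y * sin(theta_S))**0.5 in any consistent unit system."""
    return velocity / math.sqrt(gravity * depth * math.sin(slope_angle_rad))

# Hypothetical flow described in SI units (m/s, m, m/s^2) and a slope angle in radians.
u_si, y_si, g_si, theta_s = 1.2, 0.8, 9.81, 0.002

# The same flow in British units (ft/s, ft, ft/s^2); the angle has no units.
u_br, y_br, g_br = u_si / M_PER_FT, y_si / M_PER_FT, g_si / M_PER_FT

pi_si = dimensionless_velocity(u_si, y_si, g_si, theta_s)
pi_br = dimensionless_velocity(u_br, y_br, g_br, theta_s)

print(f"SI units:      {pi_si:.6f}")
print(f"British units: {pi_br:.6f}")
```

Since every pi term takes the same value in both systems, regression coefficients fitted to pi terms need no conversion.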
To illustrate this approach, we return to the dimensional analysis example in section 4.8.2.1. We focus on equation 4.64 and plot $U/(Y \cdot g \cdot \sin\theta_S)^{1/2}$ versus $Y/y_r$ for 29 stream reaches in New Zealand in figure 4.14 using data provided by Hicks and Mason (1991). Note that both axes of that plot are logarithmic, and the distribution of plotted points suggests that one could approximate the relation by an upward-sloping straight line for $Y/y_r \le 10$. Thus, we select the multiplicative (logarithmic) model (equation 4.66 with $P = 1$), and the regression analysis yields
U
relationship can be approximated as simply the average value of $U/(Y \cdot g \cdot \sin\theta_S)^{1/2} = 9.51$. Thus, the dimensional analysis combined with the measured data suggests the following model for predicting velocity:
U = 1.84 ·
Equation 4.74 is clearly an approximation, as there is much scatter about the line.
Plotting the same data but identifying the points associated with each individual reach (figure 4.15) shows that the general form of the relation applies, but that the relationship is shifted from reach to reach. This pattern suggests that other factors that vary from reach to reach, perhaps including the pi terms $W/Y$ and $Re$ or other factors not included in the dimensional analysis, also affect velocity. Thus, we might conduct further analyses to explore approaches to reducing the scatter, focusing on (1) accounting for the effects of the other pi terms identified in the dimensional analysis, and (2) looking for factors not included in the original dimensional analysis that might affect the relationship, such as the presence of vegetation or channel curvature.
However, the dimensional analysis combined with measured data has clearly been a useful first step, and we can conclude that many important hydraulic relationships can be developed by empirical analysis of the relations between dimensionless variables identified via dimensional analysis. We will encounter several examples of this approach in subsequent chapters.
Figure 4.15 $U/(g \cdot Y \cdot \sin\theta_S)^{1/2}$ versus $Y/y_r$ for 29 New Zealand stream reaches, where $y_r = d_{84}$. Flows from each reach are identified by a different symbol. Data from Hicks and Mason (1991).
4.8.4 Heuristic Equations
“Heuristic” means “helping to discover or learn; guiding or furthering investigation.”
A heuristic equation is one that, though not derived from basic physics or based on statistical analysis of observations, seems physically plausible and is generally consistent with observations. Hydrologists often invoke heuristic equations as conceptual models of complex processes when it is not practicable to develop detailed physically based representations or to collect all the data that would be necessary as input for such representations.
Probably the most common heuristic equation is the simple model of a hydrological or hydraulic reservoir as
$$Q = a_R \cdot V^{b_R}, \tag{4.75}$$

where $Q$ is the rate of output $[L^3 T^{-1}]$ from the reservoir (which might be a lake, a segment of a river channel, an aquifer, or a watershed), $V$ is the volume of water $[L^3]$ stored in the reservoir, and $a_R$ and $b_R$ are selected to best represent the particular situation.
In many situations, the exponent is assigned a value $b_R = 1$, and equation 4.75 then represents a linear reservoir. In this case, $a_R$ has the dimensions $[T^{-1}]$ and is equal to the inverse of the residence time of the reservoir, which is the average length of time an element of water spends in the reservoir (see Dingman 2002). Although the linear reservoir model does not strictly represent the way most natural hydraulic
and hydrological reservoirs work, it does capture many of the essential aspects and is mathematically (and dimensionally) tractable.
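A brief sketch may make the reservoir model concrete. It assumes, beyond what is stated above, that the reservoir receives no inflow, so that storage changes as dV/dt = −Q, and it integrates that balance with simple explicit Euler steps. The parameter values and the function names `outflow` and `drain` are hypothetical. For the linear case ($b_R = 1$) this balance has the solution V(t) = V0 · exp(−a_R · t), so after one residence time (t = 1/a_R) the stored volume has fallen to about 37% of its initial value.

```python
import math

def outflow(volume, a_r, b_r=1.0):
    """Equation 4.75: Q = a_R * V**b_R."""
    return a_r * volume ** b_r

def drain(v0, a_r, b_r=1.0, dt=0.001, t_end=1.0):
    """Storage after t_end for a reservoir with no inflow (assumed): dV/dt = -Q(V).

    Uses explicit Euler steps of length dt; storage is not allowed to go negative.
    """
    n_steps = int(round(t_end / dt))
    v = v0
    for _ in range(n_steps):
        v = max(v - outflow(v, a_r, b_r) * dt, 0.0)
    return v

a_r = 0.5                   # hypothetical coefficient, 1/day
residence_time = 1.0 / a_r  # days, for the linear reservoir
v0 = 1000.0                 # hypothetical initial storage, m^3

v_numeric = drain(v0, a_r, b_r=1.0, t_end=residence_time)
v_exact = v0 * math.exp(-a_r * residence_time)

print(f"residence time = {residence_time} days")
print(f"storage after one residence time: numeric = {v_numeric:.2f}, exact = {v_exact:.2f}")
```

The same `drain` function accepts other exponents $b_R$. No simple exponential solution applies in those cases, which is part of why the linear form is described as mathematically tractable.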
We will incorporate the linear reservoir model in a simplified approach to predicting how flood waves move through stream channels in chapter 11, and you will probably encounter heuristic equations in other hydrological and hydraulic contexts.