Data analysis 2 - Materials and methods - Dry matter and nitrogen partitioning in sweet corn (Z

2.2 Materials and methods

2.2.6 Data analysis 2


Treatment effects on the means for each plot were analysed using ANOVA, with regression analysis used to model trends. Models for regression analysis were selected based on biological relevance, significance of coefficients at the 5 % level, and reduction in residual sums of squares (RSS). The assumptions underlying regression analysis (Mead et al., 1993) were checked using plots of studentised residuals against fitted values, normal probability plots, and Durbin-Watson statistics (data not presented).

Cultivar was not included as a factor in the experimental design because results could not be standardised across the different yield variables. For example, comparing marketable ear yield would require standardising for husk and rachis moisture contents, which were unknown. Any comparison of cultivars in this experiment is therefore qualitative only.

All data in this and ensuing chapters were analysed using the Statistical Analysis System (SAS; SAS Institute, 1989). All results are discussed as being significant at the 5 % level unless otherwise stated.

Adjusting data for significant block effects

Data were adjusted using indicator variables to fit a single regression line to data where block effects were significant (Colwell, 1994). Coefficients for indicator variables from this model were then added to the original means after reversing the sign of the coefficient. The model was re-run using adjusted data but without the indicator variables. Checks were made on RSS to ensure they remained unchanged in both models. In using indicator variables in this manner, data are adjusted against a reference block. That is, the coefficient for the indicator variable indicates how much higher or lower the yields from other blocks are compared to the reference block. To avoid 'over-adjusting' the data, the reference block upon which data were adjusted excluded the highest and lowest yielding blocks.
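The adjustment described above can be sketched as follows. This is a minimal illustration, not the SAS code used in the study: it assumes a straight-line model, the function name `adjust_for_blocks` is hypothetical, and the plot means are invented for the example. It fits the model with block indicators, subtracts each indicator coefficient from the data of its block, refits without indicators, and returns both RSS values so that they can be compared.

```python
import numpy as np

def adjust_for_blocks(x, y, block, reference):
    """Remove block effects from y relative to a chosen reference block."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    block = np.asarray(block)

    # One 0/1 indicator column for every block except the reference block.
    others = [b for b in np.unique(block) if b != reference]
    indicators = np.column_stack([(block == b).astype(float) for b in others])

    # Full model: intercept, slope, and block indicators.
    X_full = np.column_stack([np.ones_like(x), x, indicators])
    coef_full, *_ = np.linalg.lstsq(X_full, y, rcond=None)
    rss_full = float(np.sum((y - X_full @ coef_full) ** 2))

    # Add each indicator coefficient to the data with its sign reversed.
    y_adj = y.copy()
    for b, gamma in zip(others, coef_full[2:]):
        y_adj[block == b] -= gamma

    # Reduced model: adjusted data, no indicator variables.
    X_red = X_full[:, :2]
    coef_red, *_ = np.linalg.lstsq(X_red, y_adj, rcond=None)
    rss_red = float(np.sum((y_adj - X_red @ coef_red) ** 2))
    return y_adj, coef_full, coef_red, rss_full, rss_red

# Hypothetical plot means: four densities in each of three blocks.
density = [4, 8, 12, 16] * 3
blocks = [1] * 4 + [2] * 4 + [3] * 4
yields = [10.1, 14.2, 17.8, 22.3, 11.9, 16.1, 20.2, 23.8, 9.2, 12.9, 17.1, 20.7]

# Block 1 is neither the highest nor the lowest yielding block in these data.
y_adj, coef_full, coef_red, rss_full, rss_red = adjust_for_blocks(
    density, yields, blocks, reference=1)
print(rss_full, rss_red)  # the two RSS values should agree
```

The RSS is unchanged because the residuals of the full model are orthogonal to the intercept and slope columns, so refitting the adjusted data returns the same line and the same residuals.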

Non-linear regression

Many of the density × yield relationships were modelled using an exponential term to represent an asymptotic tendency (Mead et al., 1993; Equations 2.1 and 2.2). Equation 2.3 is an extension of Equation 2.1 to allow a lower asymptote (Ratkowsky, 1990). Where an asymptote was not evident, either the reciprocal model (Equation 2.4; Ratkowsky, 1990; Willey and Heath, 1969) or the similar Gunary model (Equation 2.5; Gunary, 1970) were used, both of which allow relative maxima (Ratkowsky, 1990). The logistic model (Equation 2.6) was used to model sigmoidal responses having a lower asymptote of zero and a finite upper asymptote (Ratkowsky, 1990). Adding a constant to this model allowed a non-zero lower asymptote to be specified (Equation 2.7).

$$y = a\left(1 - e^{-bx}\right) \tag{2.1}$$

$$y = a e^{-kx} \tag{2.2}$$

(2.3) [equation missing from the source text]

$$y = \frac{x}{a + bx + cx^{2}} \tag{2.4}$$

$$y = \frac{x}{a + bx + c\sqrt{x}} \tag{2.5}$$

$$y = \frac{a}{1 + e^{-b + cx}} \tag{2.6}$$

$$y = \frac{a}{1 + e^{-b + cx}} + d \tag{2.7}$$
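The statement that the reciprocal model allows a relative maximum can be checked with a short sketch. It assumes Equation 2.4 has the form y = x / (a + bx + cx²), where x is density; the function names and parameter values are hypothetical and chosen only for illustration. Setting dy/dx = 0 gives a - cx² = 0, so for positive a and c the maximum lies at x = √(a/c).

```python
import math

def reciprocal_yield(x, a, b, c):
    """Yield per unit area at density x for y = x / (a + b*x + c*x**2)."""
    return x / (a + b * x + c * x * x)

def density_at_maximum(a, c):
    """Density at which the reciprocal model peaks (requires a > 0 and c > 0)."""
    return math.sqrt(a / c)

# Hypothetical parameter values, not estimates from the study.
a, b, c = 1.0, 0.05, 0.01
x_opt = density_at_maximum(a, c)
y_opt = reciprocal_yield(x_opt, a, b, c)

# Yield just below and just above the optimum is lower than at the optimum.
y_below = reciprocal_yield(x_opt - 0.5, a, b, c)
y_above = reciprocal_yield(x_opt + 0.5, a, b, c)
print(x_opt, y_opt, y_below, y_above)
```

Note that b does not affect the position of the maximum, only its height.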

To account for non-constant variance, models were weighted by the inverse of the SE of the mean for each density (Chatterjee and Price, 1991). Coefficients of determination for these models were calculated using Equation 2.8.

$$R^{2}_{adj} = 1 - \left(\frac{n - 1}{n - p} \times \frac{RSS}{TSS}\right) \tag{2.8}$$

where $R^{2}_{adj}$ is the adjusted coefficient of determination; $n$ is the total degrees of freedom; $p$ is the number of parameters in the model; $RSS$ and $TSS$ are the residual and total sums of squares for the model, respectively.
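As a worked illustration of Equation 2.8, the sketch below computes weighted sums of squares and the adjusted coefficient of determination. Several points are assumptions rather than details from the text: weights are taken as 1/SE and multiply the squared residuals, TSS is computed about the weighted mean, n is taken as the number of observations, and all values and function names are hypothetical.

```python
def weighted_sums_of_squares(observed, fitted, weights):
    """Return weighted residual and total sums of squares (RSS, TSS)."""
    total_weight = sum(weights)
    weighted_mean = sum(w * y for w, y in zip(weights, observed)) / total_weight
    rss = sum(w * (y - f) ** 2 for w, y, f in zip(weights, observed, fitted))
    tss = sum(w * (y - weighted_mean) ** 2 for w, y in zip(weights, observed))
    return rss, tss

def adjusted_r2(n, p, rss, tss):
    """Equation 2.8: 1 - ((n - 1) / (n - p)) * (RSS / TSS)."""
    return 1.0 - ((n - 1) / (n - p)) * (rss / tss)

# Hypothetical density means, fitted values, and SEs of the means.
observed = [2.0, 4.0, 6.0, 8.0]
fitted = [2.5, 3.5, 6.5, 7.5]
se = [0.5, 0.5, 1.0, 1.0]
weights = [1.0 / s for s in se]  # inverse of the SE of each mean

rss, tss = weighted_sums_of_squares(observed, fitted, weights)
r2_adj = adjusted_r2(n=len(observed), p=2, rss=rss, tss=tss)
print(rss, tss, r2_adj)
```

With these values RSS is 1.5 and TSS is 82/3, giving an adjusted R² of about 0.918.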

Independent variables in functions of ensuing graphs are expressed as plants per m². For clarity in graphs where significant N effects were recorded, data were pooled across blocks. In these instances, pooling was conducted after analysis.

Chi-square analysis

Count data were generated when determining whether an ear was harvestable or not (i.e., barrenness; Section 2.2.5). Chi-square analysis was used to determine whether barrenness was dependent on density, N rate, or both. The chi-square statistic tests the hypothesis that the parameter estimate is zero (i.e., no linear dependence; Ott and Mendenhall, 1990). For ease of discussion, data are presented as percentages.
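To illustrate how dependence in count data of this kind can be tested, the sketch below computes a Pearson chi-square statistic for a contingency table. This is a stand-in for illustration: the text does not state which SAS procedure was used, the function name is hypothetical, and the counts are invented. The example table has two rows (harvestable, barren) and three densities, giving 2 degrees of freedom, for which the upper-tail probability is exactly exp(-statistic / 2).

```python
import math

def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a 2-D table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Hypothetical counts of ears at three densities (columns).
counts = [
    [45, 40, 30],  # harvestable
    [5, 10, 20],   # barren
]
statistic, df = chi_square_independence(counts)

# The closed form below is valid for df == 2 only.
p_value = math.exp(-statistic / 2.0)
print(statistic, df, p_value)
```

Here the statistic is about 13.04 on 2 degrees of freedom, so independence of barrenness and density would be rejected at the 5 % level for these invented counts.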

Logistic regression

Logistic regression was used to investigate the relationship between whether a cob was marketable or not and the explanatory variables of cob weight and length. Assumptions underlying logistic regression (e.g., homogeneous variation; Hosmer and Lemeshow, 1989) were checked for each model.
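A logistic regression of this kind can be sketched with Newton-Raphson iterations. This is a minimal illustration, not the SAS routine used in the study: the function name `fit_logistic` is hypothetical, only cob weight is used as an explanatory variable, and the data are invented. Weight is centred and rescaled purely to keep the iterations numerically well behaved.

```python
import numpy as np

def fit_logistic(X, y, iterations=25):
    """Maximum-likelihood logistic regression by Newton-Raphson."""
    X = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(iterations):
        p = 1.0 / (1.0 + np.exp(-X @ beta))             # fitted probabilities
        gradient = X.T @ (y - p)                        # score vector
        hessian = X.T @ (X * (p * (1.0 - p))[:, None])  # information matrix
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta

# Hypothetical cob weights (g) and marketable status (1 = marketable).
weight = np.array([120, 140, 160, 180, 200, 220, 240, 260, 280, 300], dtype=float)
marketable = np.array([0, 0, 0, 1, 0, 1, 0, 1, 1, 1], dtype=float)

x = (weight - weight.mean()) / 100.0  # centred, in units of 100 g
beta = fit_logistic(x, marketable)
prob = 1.0 / (1.0 + np.exp(-(beta[0] + beta[1] * x)))
print(beta, prob.round(2))
```

A positive slope means the estimated probability of a cob being marketable rises with cob weight. The iterations assume the two classes overlap; with perfectly separated data the estimates do not converge.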

Probability estimates

All harvestable ears were processed, regardless of quality. It was noted, however, that some cobs carried pale and poorly formed kernels. Kernels recovered from such cobs were generally poorly cut and would often jam in the 'cutter'. Normally, such cobs would be discarded during processing. Preliminary data analysis indicated that cobs giving recoveries ≤ 70 g would have been discarded. Modelling the relationship between kernel recovery and cob or ear weight using simple linear regression analysis enabled the cob or ear weight expected to give a recovery of ≥ 70 g to be estimated. These estimates were then used as criteria for determining marketable yield of kernels, cobs, and ears.

In estimating the probability that ears, cobs, or kernel recoveries would be marketable, z-scores were calculated (Equation 2.9).

$$z = \frac{\bar{x} - \mu_0}{\sigma} \tag{2.9}$$

where $z$ is the z-score; $\bar{x}$ is the sample mean; $\mu_0$ is a specified value; and $\sigma$ is the sample standard deviation.

The z-score statistic is normally distributed with mean zero and unit variance (Ott and Mendenhall, 1990). Thus, a significance level for the z-score can be derived from the normal probability density function. The significance level is equivalent to the probability that the sample mean is greater than the specified value.

Effects of density and nitrogen rate on yield and yield components of sweet corn 25

Path analysis

Path analysis is a form of structured multiple regression analysis which can be used to investigate relationships among standardized variables (MacKay, 1995). The advantage of such an analysis is that the effect of one variable on another can be isolated from influences of other variables. By calculating the sign and significance of path coefficients, the direct effect of each variable on another is revealed following removal of the indirect effects exerted by other variables (Li, 1975; Wright, 1921). The greater the magnitude of the path coefficient, the greater its direct effect. Path analysis has been used to evaluate yield components in many agronomic crops (e.g., Dewey and Lu, 1959; McGiffen et al., 1994; Pandey and Torrie, 1973; Shasha'a et al., 1973), but is infrequently used in plant research (Hicklenton, 1990; Karlsson et al., 1988). The reasons for this are unclear, but may reflect the absence of path analysis routines in statistical software packages (MacKay, 1995).

In this study, path analysis provided the opportunity to examine a possible structural relationship between ear weights and variables influencing their weights (e.g., stalk diameter, tiller number, silk delay). A structure may offer a plausible interpretation of the relationships among these variables.

The path coefficients are equivalent to standardized partial regression coefficients. Thus, raw data were standardized to zero mean and unit variance, and multiple linear regressions, consistent with the postulated path diagrams, were performed. As estimates from regression analysis are distorted if excessive collinearity exists among the independent variables in the model, appropriate measures of collinearity (e.g., variance inflation factors; Myers, 1990) were checked for each model.
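The equivalence between path coefficients and standardized partial regression coefficients can be sketched as follows. The data are simulated and the variable names (stalk diameter, tiller number, ear weight) are used only as labels; the function names are hypothetical. Variance inflation factors are taken as the diagonal of the inverse of the correlation matrix of the explanatory variables.

```python
import numpy as np

def standardize(a):
    """Scale each column to zero mean and unit (sample) variance."""
    a = np.asarray(a, dtype=float)
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)

def path_coefficients(X, y):
    """Standardized partial regression coefficients of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(standardize(X), standardize(y), rcond=None)
    return coef

def variance_inflation_factors(X):
    """VIF for each explanatory variable."""
    R = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    return np.diag(np.linalg.inv(R))

# Simulated data for illustration only.
rng = np.random.default_rng(0)
n = 40
stalk = rng.normal(size=n)
tillers = 0.5 * stalk + rng.normal(size=n)
ear = 0.8 * stalk - 0.3 * tillers + rng.normal(scale=0.5, size=n)

X = np.column_stack([stalk, tillers])
coef = path_coefficients(X, ear)
vif = variance_inflation_factors(X)

# The same coefficients follow from the correlations alone.
R_all = np.corrcoef(np.column_stack([X, ear]), rowvar=False)
from_correlations = np.linalg.solve(R_all[:-1, :-1], R_all[:-1, -1])
print(coef, from_correlations, vif)
```

Because the variables are standardized, no intercept is needed, and the coefficients depend only on the correlation matrix.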

Harvestable ear weight (Section 2.2.5) was used as the response variable in multiple regression models used for path analysis, as using this variable avoided the problem of insufficient data for secondary ears at high densities. Thus, rather than treating a non-harvestable ear as a missing value, it was treated as zero harvestable weight. Plants where data were missing for other variables used in multiple regression were deleted before calculating correlation coefficients.

Canonical discriminant analysis

By reducing the dimensionality of data sets, canonical discriminant analysis (CDA) identifies and summarises important differences among treatments, while recognising the complex relationships among many variables (Cruz-Castillo et al., 1994). In finding the linear combination of variables contributing to differences amongst treatments, CDA maximally separates groups of individuals while keeping variation within groups as small as possible.

Data for CDA were not standardized as the outcome is unaffected by the scale of individual variables (Manly, 1986). However, because curvilinear or nonlinear relationships between two variables will not be reflected in the results of CDA unless suitable transformations are first performed (Mathew et al., 1994), the need for such transformations was checked by plotting each variable pair as recommended by MacKay (1995).

As with regression analysis, cases with missing data are ignored in CDA. Thus, to avoid a shortage of data at high densities, harvestable ear weight (Section 2.2.5) was used, with non-harvestable ears treated as zero harvestable ear weight.
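The idea behind CDA, and its insensitivity to the scale of individual variables, can be illustrated with a short sketch. It is not the SAS procedure used in the study: the function name is hypothetical and the data are simulated. The canonical variates are taken as eigenvectors of W⁻¹B, where W and B are the within-group and between-group sums of squares and cross-products matrices; each eigenvalue is the ratio of between-group to within-group variation along its variate.

```python
import numpy as np

def canonical_discriminant(X, groups):
    """Eigenvalues and eigenvectors of inv(W) @ B, largest eigenvalue first."""
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    grand_mean = X.mean(axis=0)
    p = X.shape[1]
    W = np.zeros((p, p))  # within-group sums of squares and cross-products
    B = np.zeros((p, p))  # between-group sums of squares and cross-products
    for g in np.unique(groups):
        Xg = X[groups == g]
        centred = Xg - Xg.mean(axis=0)
        W += centred.T @ centred
        d = (Xg.mean(axis=0) - grand_mean)[:, None]
        B += len(Xg) * (d @ d.T)
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(W, B))
    order = np.argsort(eigvals.real)[::-1]
    return eigvals.real[order], eigvecs.real[:, order], W, B

# Simulated data: three treatments, ten plants each, two variables.
rng = np.random.default_rng(1)
groups = np.repeat([0, 1, 2], 10)
means = np.array([[0.0, 0.0], [2.0, 1.0], [4.0, 0.0]])
X = means[groups] + rng.normal(size=(30, 2))

vals, vecs, W, B = canonical_discriminant(X, groups)
v = vecs[:, 0]
ratio = (v @ B @ v) / (v @ W @ v)  # separation along the first variate

# Rescaling one variable (e.g. changing its units) leaves eigenvalues unchanged.
vals_scaled, _, _, _ = canonical_discriminant(X * np.array([100.0, 1.0]), groups)
print(vals, ratio, vals_scaled)
```

The rescaled analysis returns the same eigenvalues, which is why standardizing the variables beforehand does not change the separation among groups.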

In document Dry matter and nitrogen partitioning in sweet corn (Zea mays L ) for processing : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Plant Science at Massey University (Page 44-49)