2.2. Quality-by-design (QbD) methodology
2.2.4. Modelling and interpretation of data
Once an experimental design has been completed, a model equation is fitted to the data and its regression coefficients are estimated (Bezerra et al., 2008). In RSM, a second-order polynomial model is typically used to fit the data:
$$\hat{Y} = \beta_0 + \sum_{j=1}^{k} \beta_j X_j + \sum_{j=1}^{k} \beta_{jj} X_j^2 + \sum\sum_{i<j} \beta_{ij} X_i X_j + \varepsilon$$
where β0, βj, βjj and βij represent regression coefficients for intercept, linear, quadratic and interaction terms, respectively. The response value (dependent variable) is represented by Ŷ, and Xi and Xj represent the levels of the independent variables (factors). The term k represents the number of factors under investigation, while ε represents the residual error associated with the experiment. Analysis of variance (ANOVA) is then performed to determine how well the proposed model fits the data, and to estimate how the process parameters and their interactions affect the measured responses. The experimental procedure is repeated to validate the generated model and to compare results with predicted values (Baş & Boyaci, 2007; Bezerra et al., 2008; Dejaegher & Vander Heyden, 2011).
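The fitting step described above can be illustrated with a minimal sketch for two factors (k = 2). The design, the "true" surface and the noise values below are all hypothetical, and the helper names (`quadratic_terms`, `solve_linear`, `fit_quadratic`, `predict`) are invented for illustration. In practice a statistics package would normally be used, which would also report the ANOVA table.

```python
from itertools import product

def quadratic_terms(x1, x2):
    """One model-matrix row: intercept, linear, quadratic, interaction."""
    return [1.0, x1, x2, x1 * x1, x2 * x2, x1 * x2]

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x

def fit_quadratic(points, responses):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    rows = [quadratic_terms(x1, x2) for x1, x2 in points]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, responses)) for i in range(p)]
    return solve_linear(xtx, xty)

def predict(coeffs, x1, x2):
    """Predicted response at coded levels (x1, x2)."""
    return sum(b * t for b, t in zip(coeffs, quadratic_terms(x1, x2)))

# Hypothetical 3x3 factorial design in coded units (-1, 0, +1).
points = list(product([-1.0, 0.0, 1.0], repeat=2))
# Hypothetical responses: an assumed surface plus made-up noise values.
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, 0.1, -0.3, 0.2]
responses = [
    80 + 4 * x1 + 2 * x2 - 6 * x1**2 - 3 * x2**2 + 1.5 * x1 * x2 + e
    for (x1, x2), e in zip(points, noise)
]

coeffs = fit_quadratic(points, responses)
fitted = [predict(coeffs, x1, x2) for x1, x2 in points]
mean_y = sum(responses) / len(responses)
ss_res = sum((y - f) ** 2 for y, f in zip(responses, fitted))
ss_tot = sum((y - mean_y) ** 2 for y in responses)
r_squared = 1 - ss_res / ss_tot  # share of variation explained by the model

for name, value in zip(["b0", "b1", "b2", "b11", "b22", "b12"], coeffs):
    print(f"{name} = {value:.3f}")
print(f"R^2 = {r_squared:.4f}")
```

The coefficient of determination computed at the end is only one indicator of fit; it does not replace the significance and lack-of-fit tests that ANOVA provides.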
Response surface plots graphically depict the responses observed due to the combined effect of multiple variables. If three or more variables are present, the plot visualisation is only possible if one or more variables are kept at a constant value. Steep slopes (i.e. rounded peaks) on the response surface are indicative of a significant effect of factors on the response, whereas a flat surface represents no significant effect. The critical/stationary point of a response refers to the optimum value in a range of tested parameters, and it can be determined with the aid of the second-order polynomial equation. It is not always possible to identify a single optimum value in RSM, in which case an optimum region of values may be indicated on the response surface instead (Bezerra et al., 2008; Granato & De Araújo Calado, 2014).
Figs. 2.10a and 2.10b depict response surfaces where the maximum response is located within the experimental design space. Fig. 2.10b differs in that a plateau is present in relation to variable X2, which indicates that adjustment of its levels does not affect the level of the response (y).
The response surface in Fig. 2.10c depicts a situation in which the maximum response does not lie entirely within the experimental region. The experimental design would have to be displaced to attain a maximal response, i.e. extended ranges of the independent variable would have to be incorporated in the experimental design. Fig. 2.10d shows a minimum point located within the experimental domain, and Fig. 2.10e depicts a saddle point, which represents an inflexion point between a relative maximum and a relative minimum. Saddle point coordinates do not serve as valid optimal values when the intent is to obtain a minimum or maximum response in a system (Bezerra et al., 2008).
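As a minimal sketch of how the stationary point can be obtained from a fitted two-factor second-order model, the partial derivatives with respect to each factor are set to zero and the resulting pair of linear equations is solved. The sign of the Hessian determinant then separates a maximum or minimum from a saddle. The function names and all coefficient values below are hypothetical, and the factors are assumed to be in coded units with an experimental region of -1 to +1.

```python
def stationary_point(b1, b2, b11, b22, b12, tol=1e-12):
    """Locate and classify the stationary point of a two-factor quadratic model.

    Setting both partial derivatives of the fitted surface to zero gives
        2*b11*x1 +   b12*x2 = -b1
          b12*x1 + 2*b22*x2 = -b2
    """
    det = 4 * b11 * b22 - b12 ** 2  # determinant of the Hessian matrix
    if abs(det) < tol:
        return None, "no unique stationary point (ridge or plateau)"
    x1 = (b12 * b2 - 2 * b22 * b1) / det
    x2 = (b12 * b1 - 2 * b11 * b2) / det
    if det < 0:
        kind = "saddle"
    elif b11 < 0:
        kind = "maximum"
    else:
        kind = "minimum"
    return (x1, x2), kind

def inside_region(point, low=-1.0, high=1.0):
    """True if every coded coordinate lies within the experimental region."""
    return all(low <= x <= high for x in point)

# Hypothetical coefficients (b1, b2, b11, b22, b12) for three surfaces.
examples = {
    "maximum inside region": (4, 2, -6, -3, 1.5),
    "maximum outside region": (20, 2, -6, -3, 1.5),
    "saddle": (4, 2, -6, 3, 1.5),
}
for label, coefficients in examples.items():
    point, kind = stationary_point(*coefficients)
    print(label, "->", kind, point, "inside:", inside_region(point))
```

When the determinant is zero, as for a surface that is flat along one variable, no single stationary point exists, which corresponds to reporting an optimum region rather than an optimum value.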
Figure 2.10 Examples of hypothetical response surface plots obtained in the optimisation of two variables, x1
and x2, depicting (a) maximum, (b) plateau, (c) maximum outside the experimental region, (d) minimum, and
(e) saddle surfaces (Bezerra et al., 2008).
Zou et al. (2013) successfully applied RSM to optimise the extraction of astaxanthin from Haematococcus pluvialis. Amongst others, the combined effect of the solvent composition (axis A: ratio of ethanol to ethyl acetate) and extraction temperature (axis B) on the astaxanthin yield (vertical/y-axis) was visualised on a three-dimensional response surface plot, which showed that a maximum response was attained within the experimental domain (Fig. 2.11).
Figure 2.11 Response surface plot showing the effect of two process variables on the yield of astaxanthin from ultrasound-assisted extraction of Haematococcus pluvialis (Zou et al., 2013).
Response surfaces may also be depicted as two-dimensional contour plots, on which plotlines in close proximity (i.e. darker areas) indicate that slight changes to the input factors are associated with significant changes in the response value. Elliptical contour plots represent significant interactions, while circular plots represent non-significant interactions (Steinberg & Bursztyn, 2010). Wen et al. (2015) optimised microwave-assisted extraction of anthocyanin from blackberry, using two-dimensional contour plots to illustrate the effect of four process variables on the extraction efficiency. Fig. 2.12 shows the effect of microwave power (X1) and microwave time (X4) on the
anthocyanin yield at a fixed solvent concentration and liquid-solid ratio. The elliptical shape of the contour plot indicates a significant quadratic effect, and the maximal response was located within the experimental design space where the microwave power was ca. 430 W and the microwave time was ca. 2.8 min.
Response surfaces are typically also presented as three-dimensional (3D) surface plots combined with their corresponding two-dimensional (2D) contour plots (Fig. 2.13). Li et al. (2012) optimised the extraction of mycelial polysaccharide from Fusarium oxysporum and used combined plots to show the effect of process variables X1 (extraction time) and X2 (extraction temperature) on the extraction yield. The elliptical shape of the contour plot indicates that there was a significant interaction between these two variables, and the optimal ranges were identified as 1.40–2.39 h (extraction time) and 89.92–99.35 °C (extraction temperature).
Figure 2.12 Elliptical contour plot showing combined effect of microwave power (X1) and microwave time
(X4) on extraction yield (Wen et al., 2015).
Figure 2.13 Combined response surface plot (a) and corresponding contour plot (b) showing effects of two
variables (X1 = extraction time; X2 = extraction temperature) on extraction yield (Li et al., 2012).
Fig. 2.11, discussed previously, is also an example of a combined plot, comprising a 3D fitted surface with an underlying contour plot. In this instance, the levels of the two independent variables were presented in coded form, with -1, 0 and +1 representing the minimum, centre point and maximum values of the variables, respectively.
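The coded form mentioned above is a linear rescaling of each factor about the centre of its tested range. The sketch below assumes that standard linear coding applies; the function names and the temperature range of 30 to 70 are hypothetical and are not taken from any of the cited studies.

```python
def to_coded(actual, low, high):
    """Map an actual factor level onto the coded scale (-1 at low, +1 at high)."""
    centre = (low + high) / 2
    half_range = (high - low) / 2
    return (actual - centre) / half_range

def to_actual(coded, low, high):
    """Map a coded level back to the original units of the factor."""
    centre = (low + high) / 2
    half_range = (high - low) / 2
    return centre + coded * half_range

# Hypothetical factor: extraction temperature tested between 30 and 70.
for temperature in (30, 50, 70):
    print(temperature, "->", to_coded(temperature, 30, 70))
print("coded 0.5 ->", to_actual(0.5, 30, 70))
```

Coding places all factors on a common scale, so that the magnitudes of the regression coefficients can be compared directly regardless of the original units.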
The significance of linear, quadratic and interaction effects can also be graphically demonstrated using standardised Pareto charts. Horizontal bars represent various factors and their interactions, and those bars which intersect the vertical line represent significant effects at a 5% level of significance (P = 0.05). The length of a horizontal bar is proportional to the magnitude of its estimated effect, while a negative value bar length indicates a negative effect on the measured
response (Das et al., 2014). Khan et al. (2010) investigated the effects of sonication power (P), extraction temperature (T) and ethanol:water ratio (E) on the efficiency of an ultrasound-assisted extraction process for orange (Citrus sinensis), and presented their ANOVA data in a standardised Pareto chart (Fig. 2.14).
Figure 2.14 Standardised Pareto chart for total phenolic content at 30 min extraction (Khan et al., 2010).
The magnitude and significance (P < 0.05) of the linear, quadratic and interaction effects of the three tested factors on the total phenolic content (TPC) of the extract, at a fixed extraction time of 30 min, were graphically depicted on the Pareto chart. Six of the nine horizontal bars representing these terms crossed the vertical blue line which denotes the 5% significance level. The linear effects of P, T and E had the largest significant positive effects on the response, as reflected in the relative size of their bars and the standardised effect values (8.96–16.80). The interaction of sonication power and ethanol:water ratio (PE) and the quadratic term of extraction temperature (TT) both had significant negative effects on the TPC, and the interaction of extraction temperature and sonication power (TP) had a significant positive effect on the response, but these effects were considerably smaller than those of the three linear terms. The terms represented by the remaining three bars on the Pareto chart did not have significant effects on the response.
If several responses have to be optimised simultaneously, then multi-criteria methodology, like desirability profiling, can be used. This method determines the levels of factors that result in maximum overall desirability for the process in terms of output (Bezerra et al., 2008). Since factors can have opposite effects on the measured responses, an optimal compromise has to be made. The desirability function for each response is determined by assigning a dimensionless score to the predicted values, ranging from 0 (very undesirable) to 1 (very desirable). This allows for the calculation of an overall desirability function by applying the geometric mean, after which an
algorithm is applied to the desirability function to obtain the set of variable values which maximises it (Ferreira et al., 2007; Bezerra et al., 2008).
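The desirability calculation can be sketched as follows. The piecewise "larger-is-better" scoring function is one common form and is assumed here, since the text does not specify which transformation was used. The two response models, their limits and the function names are hypothetical. A plain grid search stands in for the optimisation algorithm, which the cited sources do not name.

```python
import math
from itertools import product

def desirability_max(y, low, target, weight=1.0):
    """Larger-is-better score: 0 at or below `low`, 1 at or above `target`."""
    if y <= low:
        return 0.0
    if y >= target:
        return 1.0
    return ((y - low) / (target - low)) ** weight

def overall_desirability(scores):
    """Geometric mean of the individual scores (0 if any score is 0)."""
    return math.prod(scores) ** (1.0 / len(scores))

# Hypothetical fitted models for two responses in coded factors x1 and x2.
# The factors act in opposite directions on the two responses.
def response_a(x1, x2):
    return 30 + 4 * x1 - 3 * x2 - 2 * x1 ** 2

def response_b(x1, x2):
    return 2.0 - 0.3 * x1 + 0.5 * x2 - 0.4 * x2 ** 2

def combined(x1, x2):
    """Overall desirability at one setting of the two factors."""
    d_a = desirability_max(response_a(x1, x2), low=25.0, target=35.0)
    d_b = desirability_max(response_b(x1, x2), low=1.0, target=2.5)
    return overall_desirability([d_a, d_b])

# Grid search over the coded region in steps of 0.1 (an assumed resolution).
levels = [i / 10 for i in range(-10, 11)]
best = max(product(levels, levels), key=lambda p: combined(*p))
best_d = combined(*best)
print("best coded levels:", best, "overall desirability:", round(best_d, 3))
```

Because the geometric mean is used, a setting that makes any single response completely undesirable receives an overall score of zero, however good the other responses are.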
Bosman (2014) applied multi-response desirability profiling in the simultaneous maximisation of extract and mangiferin yield in an ethanolic extraction process for honeybush (Cyclopia genistoides). Compound prediction profiles were generated which show the effect of the two independent variables under investigation (extraction temperature and ethanol concentration) on the desirability of predicted extract and mangiferin yield (Fig. 2.15). Blue horizontal lines on the prediction profiles indicate 95% confidence intervals which aid in the assessment of prediction reliability. Vertical red lines intersecting the x-axes and apices of the desirability curves (green) indicate the levels of the independent variables which would result in the most desirable (i.e. maximum) extract and mangiferin yields. In this case, the optimal levels were an extraction temperature of 70 °C and an ethanol concentration of 40%.
Figure 2.15 Multi-response desirability profiles for maximum extract and mangiferin yield in ethanolic extraction process for Cyclopia genistoides (Bosman, 2014).