Chapter 3 Semi-Empirical Sputter Model
3.2 Sputter Parameter Optimization
Motivation and Experimental
In the previous section, a semi-empirical model that accurately predicts the deposition rate of sputtered materials was developed and validated [13]. While the model was successful at computing the deposition rate of materials given specific deposition conditions, the true purpose of the model is to solve the reverse problem. Ideally, the model would assist in determining the deposition conditions needed to create a desired composition/thickness spread. One approach would be to rearrange the model equations to directly predict the sputtering conditions for a given deposition rate spread. Currently, there is no clear pathway to perform such a rearrangement because of the non-linear nature of the model equations. Another solution is to implement an optimization procedure for the sputtering conditions. This solution is viable because of the relatively short calculation time of the model (usually less than 1 sec per model run). Implementing such an optimization procedure would reduce the time and effort required to determine experimental sputtering conditions.
To implement such a procedure, the specific variables to be optimized, the optimization method, and an appropriate objective function were determined, as outlined in section 3.2. A 2-tier validation of the optimization was performed using experimentally measured thickness spreads from a sputtered copper gun. Finally, the associated response surfaces of the model were analyzed. This analysis is important so that obstacles, such as the existence of multiple local minima, could be identified and proper methods of handling them could be implemented. Further validation was also performed [9].
The performance of the optimization procedure was validated by using measured thickness profiles from three DC sputtered copper thin-films on 7.62 cm circular Si wafers. Table 3.3 shows the sputter conditions for each of these thin-films. After deposition, profilometer measurements of the thickness of the films were taken at 24 points across the wafer. These thickness measurements were then converted to deposition rates by dividing them by the deposition time. After the deposition rates were determined, they were used in the 2-tier validation of the optimization procedure.
Table 3.3 The experimental conditions used to sputter the Cu films to validate the optimization procedure. In the last two rows, c_min and c_max show the minimum and maximum allowable values for the parameter of the corresponding column. *The gun tilts are stated with respect to the vertical direction between the sputter gun face and the substrate.

              Deposition   Sputter Gun   Sputter Gun   Sputter Gun    Deposition
              Pressure     Power         Tilt          to Substrate   Time
              (Pa)         (W)           (Degrees)*    (mm)           (sec)
Condition 1   0.667        75            62.0          101            2177
Condition 2   0.667        70            62.0          111            2188
Condition 3   0.933        100           59.0          101            1424
c_min         NA           25            52.4          81             NA
c_max         NA           500           88.0          121            NA
The first evaluation tested the performance of the optimization procedure against simulated deposition rate spreads. The goal of this evaluation was to compare the optimization results to an idealized case, where effects from model error and measurement uncertainty would not bias the results. In this evaluation, the measured deposition rates were used to calibrate the sputter model. The calibrated model was then used to predict the deposition rates across a 7.62 cm circular wafer. The optimization procedure was then used to determine sputtering conditions that would yield the simulated results, without being given any information about the deposition conditions used to simulate the film. The deposition conditions predicted by the optimization were then compared to the experimental conditions.
The second evaluation method used the optimization procedure to identify the deposition conditions directly from the experimentally measured deposition-rate spreads. The general procedure for this evaluation is exactly the same as for the first evaluation method, except that the experimentally measured deposition-rate spreads were used instead of a modeled spread. In this evaluation, model errors as well as non-idealities of the experimentally measured deposition rate, such as noise in the measurements, are present in the desired spread. These factors make perfect agreement between the experimentally measured deposition rate distribution and the distribution predicted by the optimization unlikely. This evaluation tests the ability and sensitivity of the algorithm to determine sputtering conditions for non-ideal deposition rate distributions. The details and results of this second validation are given in section A.5 of the appendix. It was left out of this chapter because it was only used to test the sensitivity of the optimization to potential measurement errors, and does not assist in the validation of the optimization procedure.
A full analysis of the results and performance of the optimization for experimental condition 1 is given in section 3.2, and the analysis of conditions 2 and 3 is given in ref. [9].
Theory
A Nelder-Mead optimization routine was implemented [85]. Nelder-Mead was chosen because it is a derivative-free optimization technique that performs well for multi-dimensional optimization. This routine was used to determine the sputter-gun tilt, the gun power, and the distance between the sputter gun and substrate, referred to here as the substrate height, for the deposition of a desired thickness gradient of copper. These three parameters were chosen because they are the ones commonly changed experimentally to tune the deposition rate spread across samples. Copper was chosen because it is a well-studied system, and because it was the material used to validate the sputter model, as shown in section 3.1. The objective function for the optimization was the Frobenius norm of the difference between the desired deposition rates, T_desired, and the predicted deposition rates, T_predicted. For the calculation of the objective function, the deposition rates were used to populate matrices. With these matrices, the objective function was calculated as:
\[ O(p, t, h) = \lVert T_{\mathrm{desired}} - T_{\mathrm{predicted}}(p, t, h) \rVert_F \tag{3.31} \]

where O(p, t, h) is the objective function evaluated at the gun power, p, gun tilt, t, and substrate height, h. Soft constraints, implemented as penalty functions, were added to the calculation of the objective function to account for the physical constraints on the deposition parameters. The resulting soft-constrained objective function, O_constrained(p, t, h), was calculated with:

\[ O_{\mathrm{constrained}}(p, t, h) = O(p, t, h) + P(p) + P(t) + P(h) \tag{3.32} \]

where P(p), P(t), and P(h) are the penalty functions associated with the gun power, gun tilt, and substrate height, respectively. The general form of the penalty function for a sputtering variable c was calculated by:

\[ P(c) = \begin{cases} 0, & \text{if } c_{\min} \le c \le c_{\max} \\ \sqrt{(c_{\min} - c)^2}, & \text{if } c < c_{\min} \\ \sqrt{(c_{\max} - c)^2}, & \text{if } c_{\max} < c \end{cases} \tag{3.33} \]

where c_min and c_max are the minimum and maximum allowable values of the sputtering variable c.
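The objective and penalty definitions in Eqs. 3.31 to 3.33 can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in this work. It assumes the penalty in Eq. 3.33 is the distance to the violated bound, and it takes the bounds from Table 3.3. The names `penalty`, `frobenius_distance`, `constrained_objective`, and the `model` callable are hypothetical.

```python
import math

# (c_min, c_max) for each optimized variable, taken from Table 3.3.
BOUNDS = {"power": (25.0, 500.0), "tilt": (52.4, 88.0), "height": (81.0, 121.0)}


def penalty(c, c_min, c_max):
    """Soft-constraint penalty (Eq. 3.33): zero inside [c_min, c_max],
    otherwise the distance to the violated bound (assumed reading)."""
    if c < c_min:
        return math.sqrt((c_min - c) ** 2)
    if c > c_max:
        return math.sqrt((c_max - c) ** 2)
    return 0.0


def frobenius_distance(t_desired, t_predicted):
    """Frobenius norm of the difference of two equally shaped matrices (Eq. 3.31)."""
    return math.sqrt(sum((d - p) ** 2
                         for row_d, row_p in zip(t_desired, t_predicted)
                         for d, p in zip(row_d, row_p)))


def constrained_objective(power, tilt, height, t_desired, model):
    """Eq. 3.32: model misfit plus one penalty per sputtering variable.

    `model` is a hypothetical callable (power, tilt, height) -> rate matrix
    standing in for the sputter model of section 3.1.
    """
    misfit = frobenius_distance(t_desired, model(power, tilt, height))
    return (misfit
            + penalty(power, *BOUNDS["power"])
            + penalty(tilt, *BOUNDS["tilt"])
            + penalty(height, *BOUNDS["height"]))
```

Because the penalty grows linearly with the size of the violation, a derivative-free search that steps outside the allowed range is steered back toward it rather than rejected outright.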
Analysis of Response Surface Plots
The input response surfaces were evaluated to identify potential obstacles that could arise during optimization. Fig. 3.6 shows an example response plot that was used during this analysis. As shown in Fig. 3.6 (B)-(D), two-dimensional sections of the three-dimensional contour plots were used to simplify the analysis and to overcome data presentation issues. During this analysis, multiple local minima were observed with respect to the sputter-gun tilt versus substrate height, as shown in the bottom right corner of Fig. 3.6 (D). Furthermore, the gradient of the objective function around the minimum becomes very small in all of the response plots. These two issues could lead the optimization procedure to terminate at non-optimal solutions.
Figure 3.6 Contour plots of the objective function with respect to the optimized sputtering variables: substrate height, gun power, and gun tilt. (B-D) 2D slices of the full contour plot shown in (A). The substrate height was set to 101 mm (B), the gun tilt was set to 62° (C), and the gun power was set to 75 W (D).
To overcome these potential issues, a "seeded" multi-start optimization method was devised. Fig. 3.7 shows a general outline of the procedure on a simplified response surface. During the initialization of the optimization procedure, n_seed randomly selected points were used to populate the search domain. The value of each variable at each seed point was constrained to lie within the physical limits of the optimized variables, i.e. the gun power, gun tilt, and substrate height. The value of the objective function, referred to as the objective value, was calculated for each of the n_seed points, as shown in Fig. 3.7, Step 1. After all objective values were calculated, the point with the minimum objective value was chosen as an initial location for a full optimization, as shown in Fig. 3.7, Step 2. The remaining objective values were then reweighted by harshly penalizing objective values from sputtering conditions numerically similar to the selected point, as shown in Fig. 3.7, Step 3. This process was repeated until n_start points had been selected for the multi-start optimization. For the multi-start optimization, a full optimization was performed from each of the n_start points. The results from these optimizations were then analyzed, and the optimizations with the lowest objective values were used for further analysis.
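The seeding and start-selection steps described above can be sketched as follows. This is a hedged illustration, not the algorithm as implemented in this work. The text does not give the form of the similarity penalty, so the sketch assumes that a large constant `big` is added to every seed within a scaled distance `radius` of the selected point. All function and parameter names (`select_starts`, `multistart`, `radius`, `big`, `local_optimizer`) are hypothetical.

```python
import random


def select_starts(objective, bounds, n_seed, n_start,
                  radius=0.15, big=1e9, rng=None):
    """Pick n_start well-separated starting points from n_seed random seeds.

    bounds: list of (c_min, c_max), one pair per optimized variable.
    radius, big: assumed form of the similarity penalty (not from the source).
    """
    if rng is None:
        rng = random.Random()
    # Step 1: scatter seeds inside the physical limits and evaluate each one.
    seeds = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_seed)]
    values = [objective(s) for s in seeds]

    def scaled_distance(a, b):
        # Euclidean distance after scaling each variable to the unit interval.
        return sum(((x - y) / (hi - lo)) ** 2
                   for x, y, (lo, hi) in zip(a, b, bounds)) ** 0.5

    chosen = []
    for _ in range(min(n_start, n_seed)):
        remaining = [i for i in range(n_seed) if i not in chosen]
        # Step 2: the remaining seed with the lowest value becomes a start.
        best = min(remaining, key=lambda i: values[i])
        chosen.append(best)
        # Step 3: harshly penalize seeds numerically similar to that start.
        for i in remaining:
            if scaled_distance(seeds[i], seeds[best]) < radius:
                values[i] += big
    return [seeds[i] for i in chosen]


def multistart(objective, bounds, n_seed, n_start, local_optimizer, rng=None):
    """Run local_optimizer(objective, x0) -> (x, value) from every start
    and return the result with the lowest objective value."""
    starts = select_starts(objective, bounds, n_seed, n_start, rng=rng)
    results = [local_optimizer(objective, x0) for x0 in starts]
    return min(results, key=lambda r: r[1])
```

Any local routine can be passed as `local_optimizer`. For example, a thin wrapper around `scipy.optimize.minimize(objective, x0, method="Nelder-Mead")` that returns the result's `x` and `fun` attributes would fit the Nelder-Mead choice described in the Theory section.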
Optimization of a Model Thickness Gradient
First, the ability of the optimization procedure to determine the sputtering conditions for an ideal sample was evaluated. For this evaluation, the sputter model was used to simulate the deposition rate distribution using the experimental deposition conditions shown in Table 3.3, Condition 1. The effect of the number of n_seed and n_start points on optimization performance was evaluated. For this evaluation, multi-start optimizations were performed with 5, 20, 40, 60, 80, and 100 n_seed points and 1, 2, 3, 4, and 5 n_start points. The procedure using each n_seed and n_start combination was repeated 300 times. The calculations were repeated because of the randomness associated with the selection of the n_seed parameter values. The results from these repetitions were used to evaluate the average performance of the algorithm. From each optimization, the predicted deposition parameters and calculation time were recorded.

Figure 3.7 A diagram outlining the steps of the seeded multi-start algorithm that was used for the optimization procedure. Due to the complexity of the response plots in this study, a representative response function is shown instead of a true response plot for the sake of clarity.

The predicted deposition parameters were compared to the initial experimental conditions, and the relative error, RE_c, of each sputtering variable, c, was calculated:

\[ RE_c = \frac{c_{\mathrm{opt}} - c_{\mathrm{des}}}{c_{\mathrm{des}}} \times 100\% \tag{3.34} \]

where c_opt was the value of the sputtering variable c that the optimization algorithm determined and c_des was the value of the sputtering variable c used to create the desired deposition rate profile.
A unified binary metric was created to determine whether an optimization was successful. Under this metric, an optimization passed if all sputtering variables were within 1% relative error of the experimental sputtering parameters; otherwise, the optimization failed. Using this metric, the pass rate, PR(n_seed, n_start), was calculated using:

\[ PR(n_{\mathrm{seed}}, n_{\mathrm{start}}) = \frac{t_{\mathrm{passed}}(n_{\mathrm{seed}}, n_{\mathrm{start}})}{t_{\mathrm{total}}(n_{\mathrm{seed}}, n_{\mathrm{start}})} \tag{3.35} \]

where t_passed(n_seed, n_start) and t_total(n_seed, n_start) were the number of passed trials and the total number of trials, respectively, for a given combination of n_seed and n_start. The average calculation time of the optimization trials was also determined. Fig. 3.8 shows the results of these calculations.
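The relative error of Eq. 3.34, the binary pass criterion, and the pass rate of Eq. 3.35 can be sketched as below. This is an illustrative sketch only. The target values are Condition 1 of Table 3.3, while the three trial results are invented for the example and are not measured or computed data. The pass check compares the absolute value of the relative error against the 1% tolerance, which is an assumed reading of "within 1%".

```python
def relative_error(c_opt, c_des):
    """Signed relative error in percent (Eq. 3.34)."""
    return (c_opt - c_des) / c_des * 100.0


def passed(optimized, desired, tolerance=1.0):
    """True if every sputtering variable is within `tolerance` percent of its target."""
    return all(abs(relative_error(optimized[name], desired[name])) <= tolerance
               for name in desired)


def pass_rate(trials, desired, tolerance=1.0):
    """Fraction of trials that passed (Eq. 3.35)."""
    return sum(passed(trial, desired, tolerance) for trial in trials) / len(trials)


# Target conditions: Condition 1 of Table 3.3.
desired = {"power": 75.0, "tilt": 62.0, "height": 101.0}

# Hypothetical optimizer outputs, invented purely for illustration.
trials = [
    {"power": 75.3, "tilt": 62.1, "height": 100.8},  # every variable within 1%
    {"power": 75.0, "tilt": 63.5, "height": 101.0},  # tilt is off by about 2.4%
    {"power": 74.6, "tilt": 61.9, "height": 101.5},  # every variable within 1%
]

print(pass_rate(trials, desired))
```

In this made-up example two of the three trials pass, so the pass rate is 2/3. A single variable outside the tolerance is enough to fail a trial, which makes the metric deliberately strict.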
Figure 3.8 (A) The calculated pass rate of the optimization algorithm using different values of n_start and n_seed. (B) The average calculation time of the optimization algorithm using different values of n_start and n_seed. In (B), the lines associated with the average calculation time using n_start of 2 and 4 are not shown for clarity.
Fig. 3.8 (A) shows that the pass rate for n_start ≥ 3 was greater than 97% for all values of n_seed, reaching a pass rate of greater than 99% when n_seed = 100. It also shows that n_start = 2 with n_seed = 5 yielded a pass rate of 92%. As the n_seed value was increased from 5 to 100, the pass rate increased to 99%. The single-start optimizations, n_start = 1, performed significantly worse than the
multi-start optimization. The single-start optimizations have a pass rate of 75% when
This shows that the issues of multiple minima and a variable-gradient response surface probably had a detrimental effect on the single-start optimizations. Additionally, the relatively high pass rate of all of the multi-start optimizations shows that the seeded multi-start procedure overcame these issues. It should be noted that, while several conditions resulted in reported pass rates of 100%, it should not be concluded that those conditions will always yield a pass. Instead, a 100% pass rate indicates that more than 300 trials would need to be performed to find the true pass rate.
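The caveat about reported pass rates of 100% can be made concrete with a short calculation. The sketch below is not part of the original analysis. It assumes that the trials are independent and share a fixed true pass rate p, so that the probability of observing n passes in n trials is p^n. Solving p^n = 1 - confidence for p gives a one-sided lower bound on the true pass rate. The function name `pass_rate_lower_bound` is hypothetical.

```python
def pass_rate_lower_bound(n_trials, confidence=0.95):
    """One-sided lower confidence bound on the true pass rate when all
    n_trials trials passed: solves p**n_trials = 1 - confidence for p."""
    return (1.0 - confidence) ** (1.0 / n_trials)


# 300 repetitions were used for each (n_seed, n_start) combination.
bound = pass_rate_lower_bound(300)
print(f"{bound:.4f}")
```

Under these assumptions, 300 passes out of 300 trials only establish, at 95% confidence, that the true pass rate is above roughly 0.990. Distinguishing a true pass rate of 99.5% from one of 100% would require more trials.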
The average calculation times for the optimizations using n_start of 1, 3, and 5 are shown in Fig. 3.8 (B). This figure shows that the calculation time increases as n_start increases, but is relatively constant as a function of n_seed. The calculation time is relatively constant with n_seed because, as n_seed is increased, the initial conditions for each of the n_start optimizations are likely closer to a minimum. Therefore, the average time for each optimization is lowered because fewer iterations of the optimization are needed before terminating on a solution, which offsets the cost of evaluating the additional seed points.
From this stage of the validation, it can be concluded that by performing the optimization procedure with n_start = 2 and n_seed = 100, or with n_start ≥ 3, there is a 99% confidence that a passable sputtering condition will be determined. Furthermore, these optimized sputter conditions will be determined within 29.8 to 72.5 seconds, which is 3-4 orders of magnitude faster than determining the sputter conditions without the optimization.