CHAPTER 4. STATISTICAL METHODS FOR QUANTIFYING
4.5 Implementation
4.5.2 Results
We show results identifying Li dendrite formation in each of the three videos for the linear and quadratic GMRM, and the linear and quadratic BGMRM methods, each with K = 4 components. The ECM algorithm for the GMRM and the spatial correction are coded in C++ and called from R (R Core Team, 2013) using the package Rcpp (Eddelbuettel and François, 2011). These models are therefore reasonable to estimate even on 100% sampled images on a regular laptop. For sparsely sampled images, computation for the full analysis (from raw image to final Li summaries) of a single image takes seconds or less, depending on the amount of growth present, on a 2.6 GHz i5 MacBook Pro. Table 4.1 gives average computation times (in seconds) to estimate the quadratic GMRM and perform the spatial correction on the full image and at the sparse sampling rates considered, for Frame 59 of Video 1. Computation time will vary depending on the number of iterations required to reach convergence in the ECM algorithm, and on the number of pixels labeled as Li that must be processed for the spatial correction.
Table 4.1: Average computation times (in seconds) to estimate the quadratic GMRM and to perform the spatial correction, for Frame 59 of Video 1 at various sampling rates.
ρ ECM ECM + Correction
1 46.14 122.48
0.1 3.04 3.91
0.05 0.56 0.77
0.01 0.22 0.22
We are still in the process of coding the BGMRM in C++; the model is currently implemented in R, and estimation of the bivariate models is computationally slow on full images. For this reason, we only illustrate the use of the BGMRM models on sparsely sampled images in this chapter. We do, however, use the univariate GMRM to compare estimates of the proportion of growth from sparsely sampled images to estimates from the full images. Additionally, we only analyze images during intervals of each video when changes in growth are occurring during charging and discharging. We consider frames t = 52, ..., 76 for Video 1, frames t = 46, ..., 68 for Video 2, and frames t = 59, ..., 86 for Video 3.
Sampling rates of ρ = 0.01, 0.05, and 0.10 are chosen to assess the methodology at low sampling rates, and the specifications of r and γ for the spatial correction at each of the sampling rates are given in Table 4.2. At the lower sampling rates of 1% and 5%, we found we needed to increase the radius of the neighborhood and decrease the cutoff for the proportion of neighbors classified as growth, because the distance between sampled pixels can be large.
Table 4.2: Choices of r and γ for each of the four sampling rates.
ρ r γ
1 5 0.4
0.1 10 0.4
0.05 10 0.2
0.01 20 0.2
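The roles of r and γ can be illustrated with a minimal sketch. It assumes that the correction keeps the Li label of a sampled pixel only if at least a proportion γ of the other sampled pixels within Euclidean distance r are also labeled Li. The rule actually used in this chapter is defined in an earlier section and may differ in its details (shape of the neighborhood, treatment of pixels with no sampled neighbors). The function name, the data layout, and the toy pixel coordinates are hypothetical.

```python
from math import hypot

def spatial_correction(pixels, r, gamma):
    """Return corrected Li labels for sampled pixels.

    pixels: dict mapping (x, y) -> bool, True if initially labeled Li.
    r:      neighborhood radius (Euclidean distance, in pixels).
    gamma:  minimum proportion of neighbors labeled Li to keep a label.
    """
    coords = list(pixels)
    corrected = {}
    for (x, y), is_li in pixels.items():
        if not is_li:
            corrected[(x, y)] = False  # only Li labels are re-examined
            continue
        # Labels of all other sampled pixels within distance r.
        neighbors = [pixels[(u, v)] for (u, v) in coords
                     if (u, v) != (x, y) and hypot(u - x, v - y) <= r]
        if not neighbors:
            # Assumption: an Li pixel with no sampled neighbors is dropped.
            corrected[(x, y)] = False
        else:
            corrected[(x, y)] = sum(neighbors) / len(neighbors) >= gamma
    return corrected

# Toy example: a small Li cluster and one isolated Li pixel in electrolyte.
example = {(0, 0): True, (1, 0): True, (0, 1): True, (1, 1): True,
           (10, 10): True,
           (9, 10): False, (11, 10): False, (10, 9): False, (10, 11): False}
corrected = spatial_correction(example, r=2, gamma=0.4)
```

In this toy example the four clustered pixels keep their Li label, while the isolated pixel at (10, 10) is relabeled as non-growth. With sparser sampling, fewer pixels fall inside a fixed radius, which is consistent with the larger r and smaller γ chosen at low ρ.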
Figure 4.20 shows the final labelings of growth for the dark field images originally shown in Figure 4.1 using the linear GMRM on 10% sampled images. The methodology appeared to classify pixels as Li growth well in Video 1, but did not perform as well on images where the majority of the frame contained growth in Videos 2 and 3. An example of poor performance of the methods is shown in the bottom center image of Figure 4.20.
Figure 4.20: Li labeling for the images from Figure 4.1 using the linear GMRM on 10% sampled images. The initial labeling was spatially corrected with r = 10 and γ = 0.4.
When growth dominates the image, it is very difficult to estimate the underlying trend. Particularly with the quadratic model, growth centered in the middle of the image can result in overestimation of the curvature of the trend, as discussed in Section 4.4.1. When this occurs, it is difficult to identify pixels corresponding to less dense areas of Li growth, where grayscale values do not contrast as strongly with electrolyte pixels.
We compare the linear GMRM, quadratic GMRM, linear BGMRM, and quadratic BGMRM on 10% sampled images using growth curves in Figure 4.21. In Video 1, the linear GMRM and both BGMRM models provide very consistent growth curves. The quadratic GMRM is less consistent, and systematically estimates a lower proportion of Li at peak growth of the experiment. This is likely due to the estimation difficulties of the quadratic model, where electrolyte pixels in the top center of the image are misclassified as Li growth and Li growth at the right and left edges is missed. The quadratic BGMRM model does not appear to suffer as badly from this issue in this case. In Videos 2 and 3, it is not clear which models perform better, as again we do not have knowledge of what the “true” growth curves should look like. Variability in the estimated proportion of growth between sequential images in Video 2 seems to be much larger in results from the bivariate models. In Video 3, both quadratic models estimate much higher proportions of growth during charging than the linear models.
Figure 4.21: Proportion of growth curves for the four models on images sampled to 10% for the three videos.
Only images that cover the time where Li growth is evolving are segmented.
We explore the effects of sparse sampling on the estimated proportion of growth by comparing estimated growth from the sparsely sampled images to estimates from the full images, for both the linear and quadratic GMRM models. In Table 4.3 we compare estimates of the RMAD statistic for the linear and quadratic models for sampling rates of ρ = 0.01, 0.05, and 0.1. Unsurprisingly, RMAD values generally increase as sampling becomes more sparse. The difference in deviation moving down from 10% to 5% is consistently much smaller than that moving from 5% to 1%, and deviation increases consistently from Video 1 to Video 3. The largest discrepancy occurs in Video 3 at 1% sampling for both linear and quadratic GMRMs, where the proportion of growth estimated at 1% differs from that estimated from full images by over 15%, on average. Figure 4.22 illustrates how proportion of growth curves estimated using the linear GMRM differ across sampling rates. At 10%, the curves follow the pattern from fully sampled images fairly closely. Curves from 5% sampled images appear to follow similar patterns with slightly more image-to-image variability. Estimates from 1% sampled images are consistently poor, particularly in Video 3. It is not surprising that estimates of the proportion of growth from 1% sampled images are poor, as coverage of the image is very low. It is, however, promising that estimates from 5% and 10% sampled images can still roughly characterize the behavior seen from analyzing full images.
Table 4.3: RMAD values for the three videos for sampling rates 0.1, 0.05, and 0.01 using proportion of growth estimated with linear GMRM and quadratic GMRM models.
       Linear GMRM            Quadratic GMRM
ρ      V1     V2     V3       V1     V2     V3
0.1    0.017  0.027  0.046    0.029  0.038  0.131
0.05   0.030  0.040  0.046    0.019  0.049  0.102
0.01   0.058  0.061  0.162    0.049  0.059  0.188
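The kind of comparison summarized in Table 4.3 can be sketched as follows. The sketch computes the mean absolute difference between a growth curve estimated from sparsely sampled images and the curve estimated from full images, frame by frame. This is only an assumed stand-in: the RMAD statistic itself is defined earlier in the chapter and may be scaled or normalized differently. The function name and the curve values are made up for illustration and are not taken from the videos.

```python
def mean_abs_deviation(sparse_curve, full_curve):
    """Mean absolute difference between two proportion-of-growth curves.

    Both arguments are sequences of proportions in [0, 1], one per frame,
    listed in the same frame order.
    """
    if len(sparse_curve) != len(full_curve) or not full_curve:
        raise ValueError("curves must be non-empty and of equal length")
    total = sum(abs(s - f) for s, f in zip(sparse_curve, full_curve))
    return total / len(full_curve)

# Illustrative (made-up) curves over four frames.
full = [0.10, 0.20, 0.30, 0.40]
sparse = [0.12, 0.18, 0.33, 0.37]
deviation = mean_abs_deviation(sparse, full)
```

Under this reading, a value of 0.025 would mean that the sparse estimate is off by 2.5 percentage points of image area per frame, on average.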
Figure 4.22: Comparison of growth curves at different sampling rates using the linear GMRM.
Large jumps or dips in estimated proportions of growth can also be due to failure of the component merging algorithm rather than poor image segmentation from the model. If the reference region chosen covers too many growth pixels, a component segmenting Li pixels can be chosen as the electrolyte component (k_ele). A large portion of pixels will then be incorrectly classified as non-growth, whereas a better reference region would correctly classify these pixels as growth. The opposite situation can arise where more than one component covers electrolyte and the merging algorithm chooses the component centered at lighter values as k_ele. This will cause a large portion of electrolyte pixels to be misclassified as growth. These failures of the merging algorithm tend to occur more often in the very sparsely sampled images (e.g., 1% and 5%) because there are fewer pixels with which to estimate the average.
4.6 Discussion
In this chapter, we presented a fast, general methodology for identifying Li growth from (S)TEM images using Gaussian mixture of regression models. Throughout the chapter we explored several forms of GMRMs for image segmentation, and ultimately determined that the univariate GMRM on dark field images with a global regression function was most useful for identifying Li growth in these experiments. The bivariate models utilizing both dark and bright field images did not show much improvement over the univariate methods, as the bright field images did not carry much unique information. However, we believe that the BGMRM would be beneficial in experiments where bright and dark field images carry more distinct information. This can occur in certain types of electrochemical experiments where, for example, features large enough to cast shadows are present and are better identified in bright field images. The univariate GMRM methodology can be implemented in real or near-real time on full or sparsely sampled images. In fact, it has now been integrated into an online system by scientists at PNNL, which is currently used to track Li growth in new (S)TEM experiments in real time. The methodology can also be applied to types of nano-scale electrochemical experiments other than those analyzed in this chapter (the results of such experiments are unpublished, so we are unable to discuss them here).
However, the methodology is not without its shortcomings, which we have discussed throughout the chapter. For example, the choice of the number of components, K = 4, was relatively ad hoc, as is often necessary in mixture models, and did not always work well at the beginning of the videos where little growth was present (Videos 1-2). For these images, a GMRM with two or three components is more effective because the redundant components in the K = 4 model overlap significantly, causing difficulty identifying the “electrolyte” component in the merging algorithm. Unfortunately, selecting the optimal number of components for each image is tedious and negates the use of the methodology in real time. Additionally, a major challenge of the proposed methodology arose in robustly modeling the background gradient present within the images. The quadratic function of the x-coordinate seemed to provide a more reasonable model for the form of the background trend than the linear form, or a more general spline functional form, but the position of the anode in the images often resulted in overestimation of the trend. Because the x values, the horizontal coordinates of an image, were defined as positive (x_i ∈ {0, ..., 255}), the behavior caused by the position of the anode in estimating a quadratic trend corresponds to a negative estimated coefficient on the linear term of the quadratic polynomial (β_1 x + β_2 x^2).
This suggests that more robust estimation of a quadratic trend in these situations could be obtained by constraining the effect of the linear component of the polynomial to be positive. Unfortunately, while the method of estimating the GMRM by maximum likelihood with the ECM algorithm is computationally fast, it does not lend itself well to such a constraint, because the update of β no longer has the closed-form solution of an unconstrained linear least squares problem and instead becomes an inequality-constrained optimization problem. The simplest alternative is to search for starting values from which the algorithm converges to a solution where the quadratic trend is estimated reasonably. We found that this could be extremely difficult, and impossible in some images, with no universally good method for finding starting values being easily attainable. A natural, and perhaps more intuitive, solution is to formulate the GMRMs in a Bayesian framework and impose the constraints on trend coefficients through appropriate priors, with the caveat that only near real-time analysis may be attainable with a Bayesian model.
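To make the effect of the positivity constraint concrete, the sketch below fits a weighted quadratic trend β_0 + β_1 x + β_2 x^2 subject to β_1 ≥ 0. It is not the ECM update used in this chapter. It assumes a single component with fixed weights (standing in for posterior component probabilities), and it relies on the fact that, for a convex least squares problem with one bound constraint, the constrained solution is either the unconstrained one or lies on the boundary β_1 = 0. All function names and the toy data are hypothetical.

```python
def solve(A, b):
    """Solve the linear system A z = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(M[i][c]))
        M[c], M[p] = M[p], M[c]
        for i in range(c + 1, n):
            f = M[i][c] / M[c][c]
            for j in range(c, n + 1):
                M[i][j] -= f * M[c][j]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - tail) / M[i][i]
    return z

def weighted_ls(columns, y, w):
    """Weighted least squares via the normal equations (X'WX) beta = X'Wy."""
    k = len(columns)
    xtwx = [[sum(wi * ci * cj for wi, ci, cj in zip(w, columns[a], columns[b]))
             for b in range(k)] for a in range(k)]
    xtwy = [sum(wi * ci * yi for wi, ci, yi in zip(w, columns[a], y))
            for a in range(k)]
    return solve(xtwx, xtwy)

def fit_quadratic_trend(x, y, w):
    """Fit b0 + b1*x + b2*x^2 by weighted least squares with b1 >= 0."""
    ones = [1.0] * len(x)
    squares = [xi * xi for xi in x]
    b0, b1, b2 = weighted_ls([ones, x, squares], y, w)
    if b1 >= 0:
        return b0, b1, b2  # constraint inactive: unconstrained fit is valid
    # Constraint active: fix b1 = 0 and refit the remaining coefficients.
    b0, b2 = weighted_ls([ones, squares], y, w)
    return b0, 0.0, b2

# Toy data: one trend with a positive and one with a negative linear term.
x = [float(i) for i in range(11)]
w = [1.0] * len(x)
fit_pos = fit_quadratic_trend(x, [1 + 2 * xi + 0.5 * xi * xi for xi in x], w)
fit_neg = fit_quadratic_trend(x, [5 - 2 * xi + 0.5 * xi * xi for xi in x], w)
```

With a single constraint the extra cost is one additional least squares solve. Inside an iterative mixture fit the check would have to be repeated for every component at every iteration, and the refit changes the remaining coefficients.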
In Chapter 5 we extend our current methodology by implementing linearly constrained Gaussian mixture of regression models under the Bayesian framework to address some of these challenges. This methodology allows for robust estimation of the quadratic trend, “automatic” selection of the number of components regardless of the amount of Li present within an image, and additional measures of uncertainty about proportion of growth estimates, as well as probabilistic assignment of Li labelings to pixels.