Regularization - Systematic effects in strong lensing source reconstruction methods

This section studies how the type and strength of the regularization affect the reconstruction. First, a data set is created and kept fixed throughout the tests. Different values of the regularization parameter λ are then tested. The most common regularization types are zeroth order, gradient, and curvature [25]. I study two cases for the regularization matrix: curvature and covariance. The curvature regularization calculates derivatives between the triangles of the reconstruction in the source plane (see [25] for a full description). The covariance regularization uses an exponential kernel, exp(−r/σ_s), to correlate positions, where r is the distance between pixels and σ_s is the correlation length in arcsec. I use σ_s = 1 arcsec.
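As an illustration of the exponential kernel described above, the following minimal sketch builds the matrix of correlations exp(−r/σ_s) for a handful of pixel positions. The function name, the example positions, and the use of NumPy are assumptions made for this illustration; how the matrix enters the actual reconstruction code is not shown here.

```python
import numpy as np

def exponential_covariance(positions, sigma_s=1.0):
    """Return C[i, j] = exp(-r_ij / sigma_s) for pixel positions in arcsec."""
    positions = np.asarray(positions, dtype=float)
    # Pairwise separation vectors, shape (n, n, 2), then Euclidean distances.
    diff = positions[:, None, :] - positions[None, :, :]
    r = np.linalg.norm(diff, axis=-1)
    return np.exp(-r / sigma_s)

# Three hypothetical source-plane pixel centres (arcsec), not taken from the thesis.
pixels = [(0.0, 0.0), (1.0, 0.0), (0.0, 3.0)]
C = exponential_covariance(pixels, sigma_s=1.0)
print(C)
```

With σ_s = 1 arcsec, two pixels separated by one correlation length are correlated at the level of exp(−1), and the correlation falls off quickly for larger separations.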

The experiment uses two True Sources: the galaxy M83 and the merger galaxy NGC2623. Because the best type of regularization depends on the source distribution [24], the regularization is studied separately for M83 and for NGC2623. The preparation, however, is the same for both and is explained once.

I modify the level of regularization by adjusting the regularization parameter λ, which starts at 100 and decreases by a factor of 100 in each test until it reaches 0.0001. For some sets of tests I add one more source reconstruction with a different value of λ. The models are compared using the Bayes factor explained in Section 2.2.
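The grid of regularization parameters can be written down in a couple of lines. This is only a sketch of the grid itself, assuming NumPy; the extra value 0.1 is the one that appears in the covariance tables of this section, and the variable names are hypothetical.

```python
import numpy as np

# Four values from 100 down to 0.0001, each a factor of 100 below the previous.
lambdas = np.logspace(2, -4, num=4)

# Some test sets include one extra value, e.g. 0.1 for the covariance runs.
lambdas_cov = np.sort(np.append(lambdas, 0.1))[::-1]

print(lambdas)
print(lambdas_cov)
```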

The regularization acts as a kind of prior that makes the reconstructed source smoother. Hence, smoother and more realistic sources are expected for higher values of the regularization parameter λ. At the same time, the model needs to fit the data, which usually happens for lower values of λ. These two opposing trends suggest an intermediate point at which the reconstruction fits the data properly while remaining realistic. By testing λ over the range from 0.0001 to 100, I expect to gather enough information to study the "intermediate point" of regularization needed to obtain a better-reconstructed source.
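The trade-off between smoothness and fit can be seen in a toy problem. The sketch below is not the reconstruction code used in this work: it assumes a one-dimensional source, an identity operator in place of the lensing operator, and a second-difference matrix as a stand-in for the curvature regularization. All names and numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x = np.linspace(0.0, 1.0, n)
truth = np.exp(-0.5 * ((x - 0.5) / 0.1) ** 2)   # hypothetical smooth source
data = truth + 0.1 * rng.standard_normal(n)     # noisy toy observation

# Second-difference operator D; R = D^T D penalizes curvature.
D = np.diff(np.eye(n), n=2, axis=0)
R = D.T @ D

def reconstruct(lam):
    """Minimize |s - data|^2 + lam * s^T R s (identity operator assumed)."""
    return np.linalg.solve(np.eye(n) + lam * R, data)

results = []
for lam in (1e-4, 1e-2, 1.0, 100.0):
    s = reconstruct(lam)
    misfit = float(np.sum((s - data) ** 2))   # how badly the data are fitted
    roughness = float(s @ R @ s)              # how much curvature remains
    results.append((lam, misfit, roughness))
    print(f"lambda={lam:g}  misfit={misfit:.4f}  roughness={roughness:.4f}")
```

In this toy setting the misfit grows and the roughness shrinks as λ increases, which is the pair of opposing trends described above.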

Testing the curvature and the covariance matrices with similar values of λ shows how important the correlation between pixels is as a function of their distance. When every pixel is strongly correlated only with its nearest neighbors, more structure is expected in the reconstructed source. When the correlation extends to distant pixels, reconstructed sources are expected to be smoother.

4.3.1 M83

For the study of the regularization with galaxy M83, the results are shown in Table 4.2 and Table 4.3.

λ         Regularization   ln(Evidence)   log₁₀(f)
100       -14114           -24550         >100
1         -2464            3069           >100
0.01      -520             9009           >100
0.0001    -124             9434           0

Table 4.2: M83 - curvature. Regularization value, natural logarithm of the evidence, and the logarithm (base 10) of the Bayes factor f for reconstructions of M83 with curvature matrix regularization and different values of the regularization parameter λ.

λ         Regularization   ln(Evidence)   log₁₀(f)
100       -2166            3748           >100
1         -343             7683           28
0.1       -133             7747           0
0.01      -31              7370           >100
0.0001    -6               6258           >100

Table 4.3: M83 - covariance. Regularization value, natural logarithm of the evidence, and the logarithm (base 10) of the Bayes factor f for reconstructions of M83 with covariance matrix regularization and different values of the regularization parameter λ.

The plots for the reconstructions, the model and the residuals of the model are shown in Figure 4.4 for the regularization using the curvature matrix and in Figure 4.5 for the regularization using the covariance matrix.

Starting with the curvature, Table 4.2 lists the Bayes factor for the different values of λ. Using the Kass and Raftery scale shown in Section 2.2, the comparison is made with respect to the test with λ = 0.0001, as it is the one with the highest evidence. The analysis shows that this model is decisively preferred over every other. However, the evidence keeps increasing as λ is lowered, so the optimum value of λ cannot be determined: the evidence never decreases within our set of tests.

Nevertheless, the difference in evidence between consecutive models shrinks as the evidence increases. Hence, the optimum value of the regularization parameter is expected to lie somewhere below λ = 0.0001. This analysis can be matched with the plot of the source reconstruction. The model with λ = 100, the least preferred model, gives a reconstruction that is too smooth and loses almost every detail. Its residual is also significantly above the noise level. As the regularization parameter is lowered, more details appear in the source reconstruction and the residual starts to look like noise. However, the structure in the residual diminishes less between the tests with the lowest λ than between the tests with the highest λ. This follows the rate of change of the evidence and shows that the quality of the source reconstruction is limited when only the regularization parameter is varied.
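The shrinking gain in evidence can be checked directly from the ln(Evidence) column of Table 4.2. The short sketch below only restates those tabulated numbers; the variable names are hypothetical.

```python
# ln(Evidence) for M83 with curvature regularization, copied from Table 4.2.
ln_evidence = {100: -24550, 1: 3069, 0.01: 9009, 0.0001: 9434}

# Order the tests from the highest to the lowest lambda.
lambdas = sorted(ln_evidence, reverse=True)

# Gain in ln(Evidence) for each factor-of-100 decrease in lambda.
steps = list(zip(lambdas, lambdas[1:]))
gains = [ln_evidence[b] - ln_evidence[a] for a, b in steps]

for (a, b), g in zip(steps, gains):
    print(f"lambda {a:g} -> {b:g}: gain in ln(Evidence) = {g}")
```

Each factor-of-100 decrease in λ yields a smaller gain than the previous one, which is the behavior that motivates expecting a maximum below the tested range.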

For the covariance regularization, the model with the highest evidence is now the one with λ = 0.1, and the differences in evidence between models are smaller. The models are compared using the Kass and Raftery scale of Table 2.1, always with respect to the model with λ = 0.1. All the other models are decisively rejected against this reference model. The exact value of λ that maximizes the evidence is unknown, but it lies between 0.01 and 1 and is probably close to 0.1. In the plot, the source reconstruction always reproduces the shape of the source, including its sub-structure. The residuals become similar to noise but, again, some structure never disappears.

The two regularizations behave differently. The evidence changes much more with λ for the curvature regularization than for the covariance one. When each is taken at its optimum λ, the curvature is decisively preferred: taking the evidence for the covariance with λ = 0.1 (lnZ = 7747) and for the curvature with λ = 0.0001 (lnZ = 9434), the logarithm of the Bayes factor is > 100, which is decisive in favour of the curvature.
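The conversion from a difference in lnZ to the base-10 logarithm of the Bayes factor is a one-line calculation. The sketch below uses the evidence values quoted in the text and in Table 4.3. Table 2.1 is not reproduced in this section, so the category thresholds are assumed to be those of the log₁₀ scale of Kass and Raftery (1995); the function names are hypothetical.

```python
import math

def log10_bayes_factor(ln_z_ref, ln_z_other):
    """log10 of Z_ref / Z_other, given the natural-log evidences."""
    return (ln_z_ref - ln_z_other) / math.log(10)

def kass_raftery_label(log10_f):
    """Category on the log10 scale of Kass and Raftery (1995), assumed here."""
    if log10_f < 0.5:
        return "not worth more than a bare mention"
    if log10_f < 1:
        return "substantial"
    if log10_f < 2:
        return "strong"
    return "decisive"

# Covariance, lambda = 0.1 (reference) against lambda = 1, from Table 4.3.
f_cov = log10_bayes_factor(7747, 7683)

# Curvature at lambda = 0.0001 against covariance at lambda = 0.1.
f_reg = log10_bayes_factor(9434, 7747)

print(f"{f_cov:.1f}", kass_raftery_label(f_cov))
print(f"{f_reg:.1f}", kass_raftery_label(f_reg))
```

The first value rounds to the entry 28 in Table 4.3, and the second is far above 100, in line with the decisive preference for the curvature stated above.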

The covariance regularization reaches its maximum evidence at a higher regularization parameter than the curvature. The correlation length σ_s for the covariance is 1 arcsec. Hence, its effect is smoother than that of the curvature, which correlates only pixels next to each other. As a consequence, a change in λ affects the curvature much more than the covariance. This also allows the curvature, at its optimum λ, to reach a better reconstruction than the covariance regularization at its own optimum λ. For both models, the residual is never consistent with noise alone.

Figure 4.4: M83 - curvature. Reconstruction with curvature regularization matrix for labelled values of the regularization parameter λ.

Figure 4.5: M83 - covariance. Reconstruction with covariance regularization matrix for labelled values of the regularization parameter λ.

4.3.2 NGC2623

The results for the galaxy NGC2623 are shown in Tables 4.4 and 4.5.

The plots of the reconstruction, the model, and the residuals for this case are shown in Appendix 8, because the trend is the same as for M83 and showing them here would add no new information.

λ         Regularization   ln(Evidence)   log₁₀(f)
100       -14569           -36989         >100
1         -3821            4850           >100
0.01      -559             11991          >100
0.001     -287             12648          42.1
0.0001    -130             12745          0

Table 4.4: NGC2623 - curvature. Regularization value, natural logarithm of the evidence, and the logarithm (base 10) of the Bayes factor f for reconstructions of NGC2623 with curvature matrix regularization and different values of the regularization parameter λ.

λ         Regularization   ln(Evidence)   log₁₀(f)
100       -2417            6692           >100
1         -355             10857          72.1
0.1       -126             11023          0
0.01      -35              10764          >100
0.0001    -12              9909           >100

Table 4.5: NGC2623 - covariance. Regularization value, natural logarithm of the evidence, and the logarithm (base 10) of the Bayes factor f for reconstructions of NGC2623 with covariance matrix regularization and different values of the regularization parameter λ.

These reconstructions behave in the same way as those of M83. With curvature regularization, the maximum evidence is reached for λ = 0.0001, and this model is decisively preferred in every Bayesian comparison. The difference in evidence between two consecutive low values of λ is smaller than between two consecutive high values. Hence, a maximum is expected close to λ = 0.0001. The residuals also behave in the same way.

For the covariance regularization, the maximum evidence again falls at λ = 0.1, and the optimum value is expected to be close to it. This model is decisively preferred over all others according to the Bayes factor. The change in evidence between models is again smaller than for the curvature.

In both cases, the residuals appear to stop improving as λ is lowered. Comparing the best models of the two cases, the curvature with λ = 0.0001 has lnZ = 12745 and the covariance with λ = 0.1 has lnZ = 11023. The logarithm (base 10) of the Bayes factor is ≫ 2, a decisive preference for the curvature with λ = 0.0001.

Discussion

As a general discussion for this Section, lower values of λ appear to fit the data better. However, if the value is lowered excessively, the reconstructed source starts to over-fit the data: noise is absorbed into the reconstruction, and the source looks too structured and unrealistic. The smoother models driven by higher regularization parameters are the worst among those tested, lacking detail and leaving structured residuals. The covariance regularization seems to work better than the curvature for many values of λ. Nevertheless, the curvature reaches a better model once λ is optimized. From this last point, it can be inferred that, in the general case, a strong correlation between nearby pixels is preferred over correlation with distant ones.

Hence, lower values of λ should be preferred, but optimizing λ should be the main goal in future use of the code. Different values of the correlation length in the covariance matrix should also be tested, as should other regularization matrices that penalize correlation between distant pixels.
