2.4 Further issues related to this work

2.4.3 Methods to report the results of different analysis strategies

One of the challenges of dealing with uncertainty, discussed in section 2.4.2, is the reporting of different types of uncertainty. Straightforward methods to quantify and illustrate uncertainty are therefore needed. In this short overview, I will outline some of the methods suggested to report epistemic uncertainty, including the vibration of effects approach, which is extended and used in contributions 3 and 4. I will also briefly discuss some advantages and disadvantages of these approaches.

Computational model robustness

Computational model robustness was developed to estimate all possible models from a theoretically informed model space (Muñoz and Young, 2018; Young, 2018). The authors do not restrict the model space to specific choices in a probability model, but suggest extending it, for instance, to variable definitions or software implementations. They therefore address what we denote as method uncertainty. Young (2018) extensively discusses how to define such a model space and presents three approaches. The first approach is motivated by the idea that every model an analyst considered worth running during the study is also worth reporting. An ‘uber log file’ could theoretically save all these models. The second approach, denoted as the ‘task force approach’, combines a wide range of expert opinions, which is close to the crowdsourcing approach of Silberzahn and Uhlmann (2015). The third strategy combines the uber log file and the task force approach. Taking all the models into account, a ‘modeling distribution’ can be calculated and visualized with kernel density graphs. In such a figure, a favorite model can be indicated.

Specification curve

The specification curve analysis was proposed and practically illustrated by Simonsohn et al. (2015) in the field of social science.
It considers all operationalization decisions in the data analysis as specifications, and thus addresses what we denote as method uncertainty. Conducting a specification curve analysis can be summarized in three steps: first, all reasonable specifications are identified; second, the effect is estimated for each of these specifications; and third, a joint permutation test is performed to test the null hypothesis of no effect. For illustration, Simonsohn et al. (2015) suggest a two-paneled figure whose upper panel shows a ‘curve’ of estimated effects plotted against specification numbers. In this curve, negative and positive estimates can be clearly distinguished, and significant estimates can be highlighted. The lower panel provides information about the decisions that produce each estimate. A practical application of specification curve analysis was provided by Rohrer et al. (2017) in psychological research, who investigated birth-order effects on personality traits.

Multiverse analysis

The multiverse analysis was suggested by Steegen et al. (2016) in psychological research with the aim of performing a statistical analysis for different data pre-processing steps. To ensure that the alternative data sets cover reasonable choices, they base their practical application on previously published studies in which these choices have actually been considered. To visualize the results, they suggest showing a histogram of raw p-values. The distribution of p-values obtained under different data pre-processing choices gives information on the robustness of findings to alternative choices: a nearly uniform distribution of p-values indicates less robust findings than a distribution concentrated at small p-values. Furthermore, for a more detailed investigation, results can be reported in grids of p-values, in which each p-value can be traced back to the analysis strategy that yielded it. In addition to the practical illustration of Steegen et al.
(2016), applications of a multiverse analysis can be found, for instance, in McBee et al. (2019), Stern et al. (2019), and Credé and Phillips (2017).

Vibration of effects

The concept of vibration of effects was initially proposed by Ioannidis (2008) and extended by Patel et al. (2015), who used it to examine model uncertainty in a large epidemiological study. The developers suggest visualizing results obtained from different analysis strategies with volcano plots. These plots typically show p-values on the y-axis and effect estimates on the x-axis. Moreover, the variability of p-values and effect estimates can be quantified through summary measures. For this purpose, Patel et al. (2015) suggest relative effect estimates and relative p-values, defined as the ratio of the 99th and 1st percentile of effect estimates and the difference between the 99th and 1st percentile of -log10(p-value), respectively. Apart from these original works, applications of the framework to different method choices can be found in Palpacuer et al. (2019) and Chu et al. (2020). In our work, we will use and extend the vibration of effects framework in order to assess and compare measurement, sampling, model and data pre-processing uncertainty. Moreover, we will apply it to different types of regression, namely logistic regression (section 2.3.2) and Cox regression (section 2.3.4), which yields relative odds ratios and relative hazard ratios as summary measures of the variability of effect estimates.

Discussion of advantages and disadvantages

In contrast to computational model robustness and the vibration of effects, specification curve analysis and multiverse analysis make it easy to trace results back to the corresponding analytical choices. This, however, comes with the disadvantage that only a limited number of models can be considered for the visualization.
For a large number of analytical choices, this can be accounted for, for instance, by visualizing only a subset of these decisions. Furthermore, when conducting a multiverse analysis, the visualization can focus on the histogram rather than the grid of p-values. Similarly, for a specification curve analysis, only the upper panel of the suggested figure can be shown.

In contrast to the other approaches, the specification curve analysis includes a permutation test, which provides a single decision across all specifications. However, performing such a test is computationally very demanding, and its application has not yet gained acceptance in practice. The vibration of effects framework, on the other hand, encompasses relative effect estimates and relative p-values as summary measures of the variability of results. Yet, neither these summary measures nor the permutation test are in principle limited to their specific framework of visualization.

In general, none of the approaches is limited to the type of uncertainty for which it was originally suggested. Using the specification curve only for data pre-processing or model choices is straightforward, and in a multiverse analysis, decisions on model specification can be included in the same way as data pre-processing choices. Finally, the vibration of effects framework can be extended to sampling, data pre-processing and measurement uncertainty, as we demonstrate in contributions 3 and 4. In contrast to the other approaches, this framework visualizes effect estimates and p-values simultaneously. Moreover, for epistemic uncertainty, it allows points in volcano plots to be highlighted in order to visualize the impact of particular choices. Thus, key choices can be identified.
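
To make the summary measures of the vibration of effects framework more concrete, the following minimal Python sketch computes a relative effect estimate and a relative p-value from a set of results obtained under different analysis strategies, following the percentile-based definitions described above. It is an illustration under stated assumptions and not the implementation used in contributions 3 and 4: the function name `vibration_summary`, its parameters and the example values are hypothetical, effect estimates are assumed to be on a ratio scale (such as odds ratios or hazard ratios), and NumPy's default linear interpolation is assumed for the percentiles.

```python
import numpy as np


def vibration_summary(effects, p_values, lower=1, upper=99):
    """Summarize the variability of results across analysis strategies.

    Hypothetical helper for illustration only.

    effects  : ratio-scale effect estimates (e.g. odds ratios), one per strategy
    p_values : the corresponding p-values
    lower, upper : percentiles used for the summary (1st and 99th by default)

    Returns (relative_effect, relative_p), where
      relative_effect = upper percentile / lower percentile of the estimates
      relative_p      = upper percentile - lower percentile of -log10(p-value)
    """
    effects = np.asarray(effects, dtype=float)
    p_values = np.asarray(p_values, dtype=float)

    if effects.shape != p_values.shape:
        raise ValueError("effects and p_values must have the same shape")
    if np.any(effects <= 0):
        raise ValueError("ratio-scale effect estimates must be positive")
    if np.any((p_values <= 0) | (p_values > 1)):
        raise ValueError("p-values must lie in (0, 1]")

    # Ratio of the upper and lower percentile of the effect estimates
    e_low, e_high = np.percentile(effects, [lower, upper])
    relative_effect = e_high / e_low

    # Difference of the upper and lower percentile of -log10(p-value)
    neg_log_p = -np.log10(p_values)
    p_low, p_high = np.percentile(neg_log_p, [lower, upper])
    relative_p = p_high - p_low

    return float(relative_effect), float(relative_p)


# Made-up example values: odds ratios and p-values from six analysis strategies
example_or = [1.10, 1.25, 1.32, 1.18, 1.41, 1.05]
example_p = [0.20, 0.03, 0.01, 0.08, 0.004, 0.35]

rel_or, rel_p = vibration_summary(example_or, example_p)
print(f"relative odds ratio: {rel_or:.3f}, relative p-value: {rel_p:.3f}")
```

Under these assumptions, a relative effect estimate close to 1 and a relative p-value close to 0 would indicate that results vary little across the analysis strategies considered. When the inputs are odds ratios or hazard ratios, the first returned value corresponds to a relative odds ratio or a relative hazard ratio, respectively.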