

3. Statistics Panel This panel displays some relevant summary statistics

In the lower part of the main interface, there are two buttons: Save as Image and Close. The Save as Image button saves a graphical result (for example, a histogram) to an image file in “png”, “bmp” or “ps” (PostScript) format. The Close button closes the current interface.

Parameters description

The parameters of the “Display Options” page are described below.

• X Axis Controls for the X axis for variable 1. Only the property values between “Min” and “Max” are displayed in the plot; values less than “Min” or greater than “Max” still contribute to the statistical summaries. The default values of “Min” and “Max” are the minimum and maximum of the selected Property. The X Axis can be set to a logarithmic scale by marking the corresponding check box. This option is valid only when all the property values are larger than zero.

• Y Axis Controls for the Y axis for variable 2. The previous remarks apply.

The user can modify the parameters through either the keyboard or the mouse.

Any modification through the mouse is instantly reflected in the visualization or the summary statistics.

Warning: a change made through the keyboard must be activated by pressing the “Enter” key.
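The distinction between the display range and the summary statistics can be illustrated with a minimal Python sketch. This is not SGeMS code; the function name `display_subset` and its arguments are hypothetical, chosen only to mirror the “Min”, “Max” and logarithmic-scale options described above.

```python
import math
from statistics import mean

def display_subset(values, vmin=None, vmax=None, log_scale=False):
    """Return the values that would be drawn; the input list is left untouched."""
    # Defaults mirror the text: the minimum and maximum of the selected property.
    lo = min(values) if vmin is None else vmin
    hi = max(values) if vmax is None else vmax
    # A logarithmic axis is only valid when every property value is > 0.
    if log_scale and any(v <= 0 for v in values):
        raise ValueError("logarithmic axis requires all values > 0")
    shown = [v for v in values if lo <= v <= hi]
    return [math.log10(v) for v in shown] if log_scale else shown

data = [0.5, 1.0, 2.0, 4.0, 100.0]
shown = display_subset(data, vmin=1.0, vmax=10.0)  # only 1.0, 2.0, 4.0 are plotted
summary_mean = mean(data)  # the statistic still uses all five values
```

The sketch only restricts what is drawn; the mean is computed from the full data list, which is the behavior the “Display Options” page describes.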

4.2.2 Histogram

The histogram tool creates a visual output of the frequency distribution, and displays some summary statistics, such as the mean and variance of the selected variable. The histogram tool is activated by clicking Data Analysis → Histogram.

Although the program will automatically scale the histogram, the user can set the histogram limits in the Parameter Panel. The main histogram interface is given in Fig.4.7, and the parameters of the Data page are listed below.

Figure 4.7 Histogram interface [1]: Parameter Panel; [2]: Visualization Panel; [3]: Statistics Panel

Parameters description

• Object A Cartesian grid or a point-set containing the variables under study.

• Property The variable to study.

• Bins The number of classes. The user can change this number through the keyboard, or by clicking the scroll bar. Any value change will be instantly reflected on the histogram display.

• Clipping Values Statistical calculation settings. All values less than “Min” or greater than “Max” are ignored, and any change of “Min” and “Max” will affect the statistics calculation. The default values of “Min” and “Max” are the minimum and maximum of the selected Property. After modifying “Min” and/or “Max”, the user can go back to the default setting by clicking “Reset”.

• Plot type The user can choose to plot a frequency histogram (“pdf”), a cumulative histogram (“cdf”) or both.
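As an illustration of how the Bins, Clipping Values and Plot type parameters interact, the following Python sketch computes a clipped equal-width histogram with its relative frequencies (“pdf”) and cumulative frequencies (“cdf”). It is not the SGeMS implementation: the function name `histogram`, the equal-width binning over [Min, Max], and the use of the population variance are assumptions made for the example.

```python
from statistics import fmean, pvariance

def histogram(values, bins, vmin=None, vmax=None):
    """Bin `values` into `bins` equal-width classes after clipping."""
    # Defaults mirror the text: the minimum and maximum of the property.
    lo = min(values) if vmin is None else vmin
    hi = max(values) if vmax is None else vmax
    # Clipped values are ignored by both the plot and the statistics.
    kept = [v for v in values if lo <= v <= hi]
    if not kept or bins < 1:
        raise ValueError("need at least one value in range and one bin")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in kept:
        # A value equal to the upper limit is assigned to the last class.
        k = min(int((v - lo) / width), bins - 1) if width > 0 else 0
        counts[k] += 1
    pdf = [c / len(kept) for c in counts]   # frequency histogram
    cdf, running = [], 0.0
    for f in pdf:                           # cumulative histogram
        running += f
        cdf.append(running)
    return {"counts": counts, "pdf": pdf, "cdf": cdf,
            "mean": fmean(kept), "variance": pvariance(kept)}

values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 50]
result = histogram(values, bins=4, vmin=1, vmax=5)  # the outlier 50 is clipped
```

With “Max” set to 5, the outlier 50 is excluded from the class counts and from the mean and variance, which is why changing the clipping values changes the reported statistics.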

4.2.3 Q-Q plot and P-P plot

The Q-Q plot compares equal p-quantile values of two distributions; the P-P plot compares the cumulative probability distributions of two variables for equal threshold values. The two variables need not be in the same object or have the same number of data. The Q-Q plot and P-P plot are combined into one program, which can be invoked from Data Analysis → QQ-plot. This EDA tool generates both a graph in the Visualization Panel and some summary statistics (mean and variance for each variable) in the Statistics Panel; see Fig.4.8. The parameters in the “Data” page are listed below.

Parameters description

• Analysis Type Algorithm selection. The user can choose either a Q-Q plot or a P-P plot.

• Variable 1 The variable selection for the X axis. The user must first choose an Object, then the Property name.

• Clipping Values for Variable 1 All values strictly less than “Min” or strictly greater than “Max” are ignored; any change of “Min” and “Max” will affect the statistics calculation. The user can go back to the default setting by clicking “Reset”.

• Variable 2 The variable selection for the Y axis. The user must first choose an Object, then the Property name. Note that Variable 2 and Variable 1 may be from different objects.

• Clipping Values for Variable 2 Remarks similar to those for Clipping Values for Variable 1.
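The difference between the two analysis types can be made concrete with a short Python sketch: a Q-Q plot pairs the quantiles of the two variables at equal probabilities, while a P-P plot pairs their cumulative probabilities at equal thresholds. This is an illustrative sketch only, not the SGeMS algorithm; the function names, the linear-interpolation quantile, and the choice of 99 evenly spaced probabilities or thresholds are all assumptions.

```python
from bisect import bisect_right

def quantile(sorted_vals, p):
    """p-quantile (0 <= p <= 1) by linear interpolation between order statistics."""
    pos = p * (len(sorted_vals) - 1)
    i = int(pos)
    if i + 1 >= len(sorted_vals):
        return sorted_vals[-1]
    return sorted_vals[i] + (pos - i) * (sorted_vals[i + 1] - sorted_vals[i])

def ecdf(sorted_vals, z):
    """Empirical cumulative probability: proportion of values <= z."""
    return bisect_right(sorted_vals, z) / len(sorted_vals)

def qq_points(x, y, n_points=99):
    """Pairs of quantiles of x and y taken at the same probabilities."""
    xs, ys = sorted(x), sorted(y)
    probs = [(k + 1) / (n_points + 1) for k in range(n_points)]
    return [(quantile(xs, p), quantile(ys, p)) for p in probs]

def pp_points(x, y, n_points=99):
    """Pairs of cumulative probabilities of x and y at the same thresholds."""
    xs, ys = sorted(x), sorted(y)
    lo, hi = min(xs[0], ys[0]), max(xs[-1], ys[-1])
    thresholds = [lo + (hi - lo) * k / (n_points - 1) for k in range(n_points)]
    return [(ecdf(xs, z), ecdf(ys, z)) for z in thresholds]

# The two variables may have different numbers of data.
x = [float(v) for v in range(1, 11)]          # 10 values from 1 to 10
y = [2 * (1 + 9 * k / 19) for k in range(20)]  # 20 values from 2 to 20
qq = qq_points(x, y)
pp = pp_points(x, y)
```

In this example the second variable is a rescaled version of the first, so the Q-Q points fall on the straight line y = 2x; two identical distributions would plot on the 45-degree line in both the Q-Q and the P-P plot.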

4.2.4 Scatter plot

The scatter plot tool (executed by clicking Data Analysis → Scatter-plot) is used to compare two variables by displaying their bivariate scatter plot and some statistics. All available data pairs are used to compute the summary statistics, such as the correlation coefficient, the mean and variance of each variable (see part [3] in Fig.4.9). To avoid a crowded figure in the Visualization Panel, only up to 10 000 data pairs are displayed in the scatter plot. The parameters in the “Data” page are listed below.

Figure 4.8 Q-Q plot interface [1]: Parameter Panel; [2]: Visualization Panel; [3]: Statistics Panel

Parameters description

• Object A Cartesian grid or a point-set containing the variables under study. This Object must contain at least two properties.

• Variable 1 The variable property listed in the Object above. This variable is associated with the X axis.

• Clipping Values for Variable 1 All values strictly less than “Min” or strictly greater than “Max” are ignored, and any change of “Min” and “Max” will affect the statistics calculation. The user can go back to the default setting by clicking “Reset”. If Variable 1 has more than 10 000 data, then the “Reset” button can be used to generate a new scatter plot with a re-sampled set of data pairs containing up to 10 000 data.

Figure 4.9 Scatter plot interface [1]: Parameter Panel; [2]: Visualization Panel; [3]: Statistics Panel

• Variable 2 The variable property listed in the Object above. This variable is associated with the Y axis.

• Clipping Values for Variable 2 Remarks similar to those for Clipping Values for Variable 1.

• Options The choice of visualizing the least-squares line fit in the scatter plot. The slope and the intercept are given below the check box “Show Least Square Fit”. This option is valid only when the two variables are displayed with the arithmetic scale.
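The quantities reported by the scatter plot tool can be sketched in a few lines of Python: the statistics and the least-squares line use all data pairs, while the display is limited to a random subset of at most 10 000 pairs. This is a hedged illustration rather than the SGeMS code; the function names, the population form of the variance, and uniform random sampling without replacement for the displayed subset are assumptions. Clipping is omitted for brevity.

```python
import math
import random
from statistics import fmean

def scatter_summary(x, y):
    """Means, variances, correlation and least-squares line from all pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x) / n          # variance of x
    syy = sum((b - my) ** 2 for b in y) / n          # variance of y
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n  # covariance
    slope = sxy / sxx                # regression of y on x; needs sxx > 0
    intercept = my - slope * mx
    return {"mean_x": mx, "mean_y": my, "var_x": sxx, "var_y": syy,
            "correlation": sxy / math.sqrt(sxx * syy),
            "slope": slope, "intercept": intercept}

def display_pairs(x, y, max_pairs=10_000, seed=None):
    """Pairs to draw: everything if small, otherwise a random subset."""
    pairs = list(zip(x, y))
    if len(pairs) <= max_pairs:
        return pairs
    return random.Random(seed).sample(pairs, max_pairs)

x = [float(i) for i in range(20_000)]
y = [3.0 * a + 1.0 for a in x]
stats = scatter_summary(x, y)      # computed from all 20 000 pairs
drawn = display_pairs(x, y, seed=0)  # only 10 000 pairs would be plotted
```

Calling `display_pairs` again with a different seed yields a different subset, which is analogous to using “Reset” to redraw the plot from a re-sampled set of pairs; the summary statistics do not change because they never depend on the subset.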
