4 GENERATION OF SIMULATED DATA AND FORECASTS .1 Introduction
4.2 RDF1_XB - Generation of Random Exogenous Base Data
4.2.1 RDF1_XB Sheet 1: Generate Random Data
The sheet was designed to obtain random sample data from either Normal or Uniform populations with user specified parameters. Three panels are provided as a user interface:
1) An input panel to control the random data generation process; see Figure 4.1.
2) An input/output (I/O) panel for obtaining target statistics for the random data to be generated and to display the descriptive statistics for the data that were generated; see Figure 4.2.
3) An output panel displaying the random data that have been generated; see Figure 4.3.
Figure 4.1: Excel Application RDF1_XB - Sheet Generate Random Data - Panel 1 controlling random data generation and export
46
Figure 4.2: Excel Application RDF1_XB - Sheet Generate Random Data - Panel 2 (excerpt) - Distribution type and target and actual statistics of random data
Figure 4.3: Excel Application RDF1_XB - Sheet Generate Random Data - Panel 3 (excerpt) - Random data generated
Figure 4.1 illustrates an important application design convention that applied to all the Excel applications developed for this study, i.e. to provide in situ documentation for running the application as part of the user interface. As can be seen in Figure 4.1, clear instructions are provided to the user on how to obtain the required random data in five steps. Figure 4.1 also illustrates two other important application design conventions:
47
Light green areas for user input. To ensure that the integrity of the various
applications are not compromised, all ranges are password protected except for the cells that accept user input;
The use of Excel Forms Control buttons to invoke the applicable VBA macros that were developed for this study.
The five steps depicted in Figure 4.1 for controlling the random data generation process are:
Step 1: Excel’s built-in Data Validation tool is used to accept valid values specifying the subset of the 200 random datasets to be processed when the Generate Random Data button or Export Random Data button is clicked.
Step 2: For each of the datasets selected in Step 1 the user specifies the desired parameters in Panel 2 depicted in Figure 4.2. This consists of:
a) The distribution type selected from Excel Data Validation dropdown lists with options Normal and Uniform;
b) The sample size in the range 10 to 1000 also selected from Excel Data Validation dropdown lists;
c) The distribution parameters, i.e. the mean and standard deviation for Normal distributions and the minimum and maximum for Uniform distributions.
Step 3: Here the user can specify the target maximum error percentage margin for the sample mean and standard deviation of the random data to be generated. When generating a random sample its distribution statistics will invariably differ from the distribution parameters of the relevant parent population. For Normal distributions the initial sample data can be easily transformed by a linear transformation to obtain a sample with the target sample mean M and standard deviation s. If the statistics for the initial sample X are M and non-zero s then a sample X with the target M and s can be obtained as:
( ̅̅̅̅) where
This transformation cannot be used for Uniform distributions due to the fact that it may produce values that are not uniformly distributed within the interval defined by the user specified minimum and maximum and may even fall outside
48
the defined range. The problem is addressed with an iterative process in the VBA macro XB_GRD_Generate_Random_Data documented below.
Step 4: The message “Click button” is displayed if the required parameters for
generating new random data are specified, i.e. dataset(s) to process; distribution type, sample size, mean and standard deviation for Normal distributions,
minimum and maximum for Uniform distributions and a non-zero target maximum error range for mean and standard deviation values. If there are missing and/or invalid parameters, then a warning message shaded either yellow (missing) or red (invalid) is displayed with a request to first address the issue(s) before proceeding to generate random data. Clicking the button activates the VBA macro XB_GRD_Generate_Random_Data documented below.
Step 5: The final step is to export the random data generated for the dataset numbers specified in Step 1. A “Click button” message is displayed if there are no invalid or missing input data. Clicking the button activates the VBA macro
XB_GRD_Export_Random_Data documented below.
The second panel of RDF1_XB, depicted in Figure 4.2, provides for the specification for up to 200 datasets: the distribution type (Normal or Uniform); the sample size (10 to 1000); and either the mean and standard deviation for Normal distributions or the minimum and maximum for Uniform distributions. Excel’s Data Validation tool is used to ensure valid input: numeric values only; positive standard deviations; and minima less than maxima. Missing values are shaded yellow. Panel 2 displays the error percentages for the differences between the target and actual mean and standard deviation values as well as descriptive statistics for each random dataset appearing in Panel 3. Error percentages greater than the target maximum error margin are shaded red.
The two VBA macros linked to RDF1_XB sheet Generate Random Data are (code in Appendix 1):
1) Macro XB_GRD_Generate_Random_Data – This macro performs the following functions for each of the up to 200 datasets selected by the user in Panel 1:
1.1) It invokes the Excel Calculate routine to generate new random values for the random data generation formulas that are embedded in sheet Generate Random Data. Two different formulas are used to obtain random data:
49
i. Normal distribution random data are generated with the Excel function NORM.INV(p,m,s) which returns the inverse of the Normal cumulative distribution for the specified probability p, mean m and standard deviation s. Random p values are generated with the Excel function RAND() which returns random values between 0 and 1 from a continuous Uniform distribution, and m and s are the relevant user specified mean and standard deviation.
ii. Uniform distribution random data are generated with a formula
min + RAND()*(max – min) where min and max are the relevant user specified minimum and maximum.
1.2) It checks whether the differences between the target and actual values for both he mean and standard deviation are less than the user specified
maximum error margin. If the test fails it recalculates a new random dataset and repeat the process until either the test is successful or 100 iterations have been performed. Note that this iterative process is for Uniform distributions only because, as explained previously, for Normal distributions a linear transformation ensures that the generated data have the target sample mean and standard deviation.
1.3) It transfers the random data from where it is calculated in the sheet to the designated column for the dataset being processed.
The macro concludes with either a message that the random data were successfully generated or that it were generated but with some sampling errors greater than the user specified maximum error margin.
2) Macro XB_GRD_Export_Random_Data – This macro exports the up to 200 datasets selected by the user in Panel 1 as an Excel workbook with the same format as
depicted in Figure 4.3.