• No results found

Using Data Analysis Tools

In document Excel Advanced (Page 35-70)

Excel 2010 provides a rich assortment of analysis tools that you can use to explore and interpret your data. While there are plenty of built-in tools in Excel, there are also additional software files than can be added to Excel to extend its functionality even more. These additional software tools are known as add-ins.

Add-ins can come in the form of a single task-oriented tool like Solver (which we will look at later in this lesson), or they can consist of several tools combined in a pack, like Excel’s Analysis ToolPak, which provides an assortment of tools for analyzing scientific, statistical, or

engineering related data. Some add-ins come packaged with the Microsoft Office system, while others may be provided by third party vendors.

In this lesson, we’ll have a look at a few of the what-if analysis tools that are built into Excel 2010, like goal seek and one and two input data tables. We’ll also look at add-ins and how to enable them. We’ll also take a close look at the Solver data analysis add-in.

Enabling Data Analysis Tools

As mentioned above, there are some interesting data analysis tools provided as add-ins for Excel, such as the Analysis ToolPak and the Solver. In a general sense, add-ins can be described as small software modules or files that extend the functionality of the larger Office applications.

The important thing to know is that these are not enabled in the Excel program when it is installed, as they often provide more obscure or mathematically advanced functionality than is required in a typical business environment. You must enable them manually from within Excel.

The first step in enabling add-Ins is to navigate to the add-in management section of the Excel Options window. To start, click File → Options:

When the Excel Options window opens, you will see a panel on the left side with a series of categories. Click Add-Ins:

When you choose this option, the Excel Options window should look something like this:

In the main viewing area, you will see a large scrollable list called Add-ins. This list is broken down into the following sections: Active Application Add-ins, Inactive Application Add-ins, Document Related Add-ins, and Disabled Application Add-ins. At the top of this list are the headings Name, Location, and Type. The corresponding name, location path, and add-in type, for each add-in are grouped in the list according to these headings.

If you click on any add-in in this list you will see the name of the add-in, the publisher, the location (file directory), and a brief description of the add-in in the area beneath the list. Here’s an example:

If you click the Manage menu near the bottom of the window, you will see the following options:

Here you can see two types of add-ins at the top of the list: COM Add-ins and Excel Add-ins.

COM stands for component object model, a Microsoft technology that helps software components (like Microsoft Office programs) to communicate. For example, a form of COM technology is used to allow Word documents to link to data in Excel spreadsheets. Often a single COM add-in will be available for a number of Office applications.

For the purposes of this lesson, we are interested in the add-ins specifically for Excel. To manage the Excel add-ins, select it in the list, and click Go.

When you click Go, an Add-Ins dialog will appear and show a list of the available Excel add-ins:

If an add-in is enabled, it will have a checkmark next to its name in the list. In the image above, no Excel add-ins are enabled yet. If you want to see a brief description of an add-in, click it in the list. In the example above, the Solver add-in is selected and a description of the Solver appears near the bottom of the dialog.

To enable a particular add-in, click to place a checkmark in the appropriate checkbox and click OK. To enable the Solver add-in, for example, simply place a checkmark next to it the Add-Ins dialog and then click OK.

If you see a dialog stating that the add-in you are enabling is not currently installed on your computer, simply click the Yes button in the dialog to install it.

To make sure the Solver add-in is enabled, click the Data tab, and then look for Solver.

Similarly, if you check the box corresponding to the Analysis ToolPak, and then click OK, it will be enabled in Excel as well.

In the following image, you can see that the Analysis option is now available on the Data tab.

An Overview of Excel’s Analysis Tools

As mentioned near the beginning of this lesson, the Analysis ToolPak add-in provides an assortment of tools for analyzing scientific, statistical, or engineering related data. Many of these features are for scientific or statistical data analysis, so a comprehensive exploration of the use of these tools is beyond the scope of this manual. That being said, here is a quick and gentle overview of some of the Analysis ToolPak features.

To enable the Analysis ToolPak, make sure there is a check mark in the appropriate checkbox in the Add-Ins dialog:

If the Analysis ToolPak is enabled, there will now be a corresponding option (Data Analysis button) for it on the Data tab:

If you click the Data Analysis button on the Data tab, a dialog with a scrollable list of tools will appear.

The tools available in the list are intended for quite specific and advanced data analysis techniques. To start one of the tools, simply click it in the list to select it and then click OK.

The first tool listed is called “Anova: Single factor.” (Anova is short for analysis of variance.) This tool can be used to examine if the means (averages) of samples (groups of experimental data from a population) are different in a significant way. Tests like these are often used to examine the significance of the results in scientific studies.

Generally in a study, an independent variable will be changed by the experimenters to see if it has a result on the dependent variable. This is often done by comparing the results for different groups of data (samples), resulting from tests at different levels of the independent variable.

For basic experiments with only two samples, you could use a basic test like a T test (also included in the Analysis ToolPak) to compare the averages of the samples. For more complex experiment (with more than two samples), an Anova test could be used.

A single factor Anova can be used to compare multiple samples, based on a single independent variable.

A two factor Anova test can be used to compare multiple samples, based on two independent variables

The next item in the list of analysis tools is a Correlation test.

It is often useful to know if two variables are related in some way, and a correlation test can help to determine this. Think of the word correlation as “co-relation.” If there is a strong correlation between two sets of data, it generally means that they trend or change in concert with each other. If there is a significant correlation in your data, you may be able to make predictions based on this relationship.

For example, if you sell sunglasses, you may be curious as to the relationship between the weather (amount of sun or temperature) and the sales of your product. Here is an example worksheet.

To invoke the Data Analysis dialog, click Data → Data Analysis. When the dialog appears, select the Correlation option and click OK.

When you click OK, a Correlation dialog will appear on your Excel screen. In this dialog, you can

In the following example, the range of data to be analyzed is B1 to C14, with the data grouped in columns and column labels included. Furthermore, the results will be displayed on a new worksheet.

Once you are ready, just click OK to generate the analysis results. The following image shows the results as displayed in a new worksheet.

The important value to take note of is in cell B3. This is the correlation value. This value will be between -1 and +1. If the value is +1 it means that there is a perfect correlation. In other words, the two sets of data rise in perfect concert. If the correlation value is a -1, it means that just as one set of data rises, the other set of data shrinks. A value of 0 would indicate no relationship whatsoever. As you can see in this example, a value of .915834314 (almost 1) indicates a very strong positive relationship between the hours of sunlight per day and the sales of pairs of sunglasses per day.

Another interesting tool in the Analysis ToolPak is the Random Number Generator.

You can use random number generation to simulate data or random behavior. For example, if you wanted to simulate a series of coin tosses for a statistical experiment, you would select Random Number Generation from the list in the Analysis Tools dialog and click OK. This will display the Random Number Generation dialog.

To simulate a coin toss, enter the number of variables (1 for one coin) and then enter the number of random numbers (the amount of flips of the coin) in the appropriate fields. You must also choose the correct distribution (which is the Bernoulli distribution for the case of flipping a coin) from the drop list provided. You must also enter a p value (a value between 0 and 1), which represents the probability of getting a heads or a tails. Since we want to simulate a fair coin we will choose .5 (50/50 chance) for the p value. For the purposes of this example, it is not necessary to enter anything in the Random Seed box. Finally, you can choose to put the

The following image shows the results of the coin toss in cells A1 through to A10. Think of a 1 as a head and a 0 as a tail:

Probably the most basic tool in the Analysis ToolPak is the Rank and Percentile feature.

You can use this feature to create a percentile rank of your chosen data values. For example, here is a batch of random data that could represent scores on a hypothetical test.

To rank these scores we would select “Rank and Percentile” from the list in the Data Analysis dialog and click OK. This action will display the Rank and Percentile dialog:

example, the New Worksheet radio button is selected for the destination. When you are ready, just click OK to proceed.

In the results, you can see separate columns labeled Point, Column1, Rank, and Percent. The Point column provides values for the position of the given test score in the original data column. Column1 lists the ranked test scores, the Rank column shows the rank of the

corresponding test score, and Percent column shows the percentage of scores that are below that score.

As a final note on the tools in the Analysis ToolPak, let’s discuss how to get a bit of help with a given tool if you need it.

If you see a tool in the Data Analysis dialog that you would like to know more about, click the Help button on the right side of the dialog:

Then an Excel help window will open with a list of items corresponding to the tools in the Data Analysis dialog. To get help or information on an item in this list, click on the name of the item and it will expand to show a brief description of the given item and what it is used for:

By default, Excel will try to connect to the Internet to gather available information from Office.com. If you do not have an Internet connection, help will retrieve information that is stored locally on your computer.

An Overview of the Available Add-In Tools

As mentioned earlier, the available Excel 2010 add-ins are:

Analysis ToolPak

Analysis ToolPak – VBA

Euro Currency

Solver Add-in

Now that we have looked at some of the tools in the Analysis ToolPak, it is time to turn our attention to the remaining Excel add-ins.

The Analysis ToolPak – VBA add-in can be enabled just like any other Excel add-in, but you will not see an option for it in any of the tabs.

The Analysis ToolPak-VBA add-in should not be confused with the Analysis ToolPak that we just looked at. VBA stands for Visual Basic for Applications. It is a programming tool for building functions and automating tasks in Microsoft Office applications. The ToolPak-VBA add-in provides the user with the ability to call (use) the functions in the Analysis ToolPak from within VBA programs (like macros). This is not the same as using the ToolPak functions manually from within your worksheet, which is why the ToolPak-VBA add-in doesn’t appear as on the Data tab.

It is primarily for behind-the-scenes work by providing VBA programmers with more functionality. If you only intend on using the ToolPak features manually as shown in the previous pages, there is no need to install the ToolPak VBA add-in.

Next is the Euro Currency Tools add-in.

When this add-in is enabled, currency commands will appear on the Formulas tab:

These tools can be used to convert various European currencies to Euros or other European currencies. Take the following worksheet as an example.

Here we have a column of values that represent Austrian Schillings. To convert this column of Schillings to Euros, simply select the range of values, and then choose Formulas → Euro Conversion:

At this point, a Euro Conversion dialog box will appear. In the dialog, you must specify the range of source data (A2:A13 in this case) and the destination range for the conversion (B2:B13).

Next, you can use the drop lists provided to specify the source currency (Schillings) and the destination currency (Euros).

Once everything is ready, click OK in the dialog to convert the currency. Here is the result:

You can also format cells so that Excel recognizes the cell values as Euro currency, much the same as if you formatted the cells as standard ($) currency. To do this, start by selecting a range of cell values.

Next, choose Formulas → Euro Formatting.

Here is the result:

Just as with standard currency formatting, if you change the value in the cell, the Euro currency formatting still remains with the cell.

As you can see, this add-in isn’t really for data analysis, but it is worth mentioning these tools in this overview as they can be useful if you work with European currencies.

The final Excel add-in is Solver:

When the Solver is enabled, it can be accessed by a button on the Data tab:

Solver is a fairly advanced tool for examining complicated what-if scenarios. Essentially, it is used to minimize or maximize systems of equations based on a set of constraints. We won’t get into too much detail about it here because we will explore it in depth at the end of this lesson.

Using a One or Two Input Data Table

You can use Excel data tables to test how your formula results will vary when the data that the formula operates on changes. You do this by specifying a series of hypothetical values that Excel will use to evaluate the formulas with. This allows you to then view the results of the

evaluations and compare the results for the different data inputs. Data tables save you the trouble of entering a series of input values into a worksheet and recording each recalculation of the worksheet for each new input that you enter. When you use a data table, Excel will

substitute a range of values into the worksheet formulas for you, and tabulate the results so they can be viewed easily.

In Excel, you can create a single input data table or a two input data table. A single input table will substitute a range of values as a single variable in as many formulas as you like. With a two input data table, you can specify ranges for two input variables, but these input variables can only be applied to one formula.

The following example involves a hypothetical consulting firm. Our first goal is to examine the effect of the number of clients on the total profit, expenses, and income.

Currently, the firm has 10 clients. The values for Wages, Total Costs, Total Expenses, and Profit are all calculated by formulas dependent on the number of clients the firm can retain. For example, you might like to see what the profit formula and other formulas in the worksheet would produce for a varying number of clients. To manage this, we can use a single input data table.

Keep in mind that there are some strict rules you should follow when building your data table to help ensure that it works correctly. First, list the input values that you want to try in a row or column of adjacent cells. For this example, we’ll use a column of input values. The column will be named Hypothetical Clients and will hold different values for our variable (the number of clients).

In the row just above your input data (row 4 in this example), enter the cell references to the formulas that you want to evaluate. Make sure you enter the references starting one cell to the right of the column of input values. In this case, we want references to cells C5, C13, and C15 placed in the row just above our input values (row 4), and starting one cell over to the right from the input column (in column F).

The resulting worksheet looks like the following:

The input variables are in the cell range E5:E19. Cell F4 contains the reference =C5, cell G4 contains the reference =C13, and cell H4 contains the reference =C15. These references point to the Total Income, Total Expenses, and Profit formulas respectively.

It is a good idea to label your new columns appropriately, so you can clearly understand the data table results. In the preceding example, the new columns are labeled Total Income, Total Expenses, and Total Profit. Now, let’s have a closer look at the structure.

Notice that the formula references (in cells F4:H4) are in a row just above, and one cell to the right of the first input variable (in E5). The data table is now ready. The input variables are listed in the “Hypothetical Clients” column and the formula references are one row above and one cell to the right with respect to the first input value. Also, all of the elements in the data table are clearly identified.

The next step is to select the full range of cells from the data table formed by the input

The next step is to select the full range of cells from the data table formed by the input

In document Excel Advanced (Page 35-70)

Related documents