• No results found

When you check a numerical variable in the field list, it is added to the Values area

PREDICTORS OF SUCCESSFUL MOVIES

2. When you check a numerical variable in the field list, it is added to the Values area

Starting with Excel 2007, there are three different layouts for pivot tables, but the differences are relatively minor. Ultimately, it is a matter of taste. Figure 3.37 Sum of TotalCost by Time and Region (Tabular Layout)

This is exactly what happens when you check Time, Region, and TotalCost. However, this is just the beginning. With very little work, you can do a lot more. Some of the possi- bilities are explained in the remainder of this example.

First, however, we discuss the new look of pivot tables in Excel 2007. Notice that the pivot table in Figure 3.36 has both row fields, Time and Region, in column A. This wasn’t possible in old-style pivot tables, where the two row fields would have been in separate columns. Microsoft decided to offer this new layout because of its clean, streamlined look. In fact, you can now choose from three layouts: Compact, Outline, or Tabular. These are available from the Report Layout dropdown list on the Design ribbon. When you create a pivot table (in an .xlsx file), you get the compact layout by default. If you would rather have the tabular or outline layout, it is easy to switch to them. In particular, the tabular lay- out, shown in Figure 3.37, is closer to what was used in previous versions of Excel. (Outline layout, not shown here, is very similar to tabular layout except for the placement of its subtotals.)

One significant advantage to using tabular (or outline) layout instead of compact lay- out is that you can see which fields are in the row and column areas. Take another look at the pivot table in Figure 3.36. It is pretty obvious that categories such as afternoon and morning have to do with time of day and that categories such as Midwest and South have to do with region of country. However, there are no labels that explicitly name the row fields. In contrast, the tabular layout in Figure 3.37 names them explicitly. Still, you can choose the layout you prefer.

Hiding Categories (Filtering)

The pivot table in Figure 3.36 shows all times of day for all regions, but this is not necessary. You can filter out any of the times or regions you don’t want to see. To understand how this works, make sure the Options ribbon is visible. In the Active Field group, you will notice that one of the fields is designated as the active field. The active field corresponds to the location of your cursor. If your cursor is on a Time category, such as Evening, then Time is the active field. If your cursor is on a Region category such as NorthEast, then Region is the active field. If your cursor is on any of the numbers, then Sum of TotalCost is the active field.

Once you understand the active field concept, then the way Excel implements filtering makes sense. If Time is the active field and you click on the Row Labels dropdown arrow, you see the dialog box in Figure 3.38. To see data only for Afternoon and Morning, for example, uncheck the Select All item and then check the Afternoon and Morning items. Similarly, if Region is the active field and you click on the Row Labels dropdown arrow, you can check which regions you want to filter on. (If you are in tabular layout, it is more straight- forward, because each row field then has its own dropdown list.) For example, the pivot table in Figure 3.39 is obtained by filtering out the Evening and NorthEast categories. Note how the filter symbols replace the arrows in row 3 to indicate that some categories have been fil- tered out. Also, note that the updated subtotals for Morning and Afternoon and the updated grand total for all categories do not include the hidden categories.10

3.5 Pivot Tables 121

Figure 3.38 Filtering on Time

Figure 3.39 Pivot Table with Hidden Categories

10You have probably noticed that the dialog box in Figure 3.38 is exactly like the one for Excel tables discussed

in the previous chapter. This is no accident. You already learned how to filter tables, so there is nothing new to learn for filtering pivot tables.

Sorting on Values or Categories

It is easy to sort in a pivot table, either by the numbers in the Values area of the table or by the labels in a Row or Column field. To sort by the numbers in the Values area, right-click on any number and choose the Sort item. If a simple A-Z or Z-A sort isn’t enough, you can use the More Sort Options item. For example, this allows you to sort on the column of numbers that contains the selected cell or on the row of numbers that contains this cell.

To sort on the labels of a Row or Column field, you can again right-click on any of the categories such as Morning and select Sort. Alternatively, you can click on the dropdown arrow for the field, such as Time in Figure 3.39, and you will get the dialog box in Figure 3.38 that allows both sorting and filtering. However, be aware that sorting on labels is always in alphabetical or reverse alphabetical order. This is not always what you want. For example, suppose you want the natural sort order Morning, Afternoon, Evening. This isn’t the A-Z or Z-A order, but it is still possible to sort manually. The trick is to select the cell of some label such as Morning and place the cursor on the border of the cell so that it becomes a four-sided arrow. Then you can drag the label up or down, or to the left or right. It takes a little practice, but it isn’t difficult. ■

Changing Locations of Fields (Pivoting)

Starting with the pivot table in Figure 3.39, you can choose where to place either Time or Region; it does not have to be in the Row area. To place the Region variable in the Column area, for example, drag the Region button from the Row Labels area of the Field List window to the Column Labels area. The pivot table changes automatically, as shown in Figure 3.40. (We removed the filters on Time and Region.)

Figure 3.40 Placing Region in the Column Labels Area

Alternatively, you can categorize by a third field such as Day and locate it in a differ- ent area. As before, if you check Day in the Field List window, it goes to the Row area by default, but you can then drag it to another area. The pivot table in Figure 3.41 shows the result of placing Day in the Report Filter area. By clicking on the dropdown arrow in row 1, you can then show the pivot table for all days or any particular day. In fact, there is now a Show Multiple Items option you can check. (This option wasn’t available before Excel 2007.) We checked this option and then selected Friday and Saturday to obtain the pivot table in Figure 3.41. It reports data only for Fridays and Saturdays.

This ability to categorize by multiple fields and rearrange the fields as you like is a big reason why pivot tables are so powerful and useful—and easy to use.

Changing Field Settings

Depending on which field is the active field, you can change various settings in the Field Settings dialog box. You can get to this dialog box in at least two ways. First, there is a Field Setting button on the Options ribbon. Second, you can right-click on any of the pivot

Changing the locations of fields in pivot tables has always been easy, but the new user interface introduced in Excel 2007 makes it even easier.We favor dragging the fields to the various areas, but you can experiment with the various options.

table cells and select the Field Settings item. The field settings are particularly useful for fields in the Values area, as we now explain.

For now, right-click on any number in the pivot table in Figure 3.41 and select Value Field Settings to obtain the dialog box in Figure 3.42. This allows you to choose which way you want to summarize the TotalCost variable—by Sum, Average, Count, or several others. You can also click on the Number Format button to choose from the usual number formatting options, and you can click on the Show Values As tab to display the data in various ways (more on this later). If you choose Average and format as currency with two decimals, the resulting pivot table appears as in Figure 3.43. Now each number is the

3.5 Pivot Tables 123

Figure 3.41 Placing Day in the Report Filter Area and Filtering on Day

Figure 3.42 Value Field Settings Dialog Box

Figure 3.43 Pivot Table with Average of TotalCost The key to summa- rizing the data the way you want it summa- rized is the Value Field Settings dialog box. Get used to it because you will use it often.

average of TotalCost for all orders in its category combination. For example, the average of TotalCost for all Friday and Saturday morning orders in the South is $107.69, and the average of all Friday and Saturday orders in the South is $109.10.

Pivot Charts

It is easy to accompany pivot tables with pivot charts. These charts are not just typical Excel charts; they adapt automatically to the underlying pivot table. If you make a change to the pivot table, such as pivoting the Row and Column fields, the pivot chart makes the same change automatically. To create a pivot chart, click anywhere inside the pivot table, select the PivotChart button on the Options ribbon (see Figure 3.31), and select a chart type. That’s all there is to it. The resulting pivot chart (using the default column chart option) for the pivot table in Figure 3.43 appears in Figure 3.44. If you decide to pivot the Row and Column fields, the pivot chart changes automatically, as shown in Figure 3.45. Note that the categories on the horizontal axis are always based on the row field, and the categories in the legend are always based on the Column field.

Figure 3.44 Pivot Chart Based on Pivot Table

Figure 3.45 Pivot Chart after Pivoting Row and Column Fields Pivot charts are a great extension of pivot tables.They not only “tell the story” graphically, but they update automatically when you rearrange the pivot table.

Note that when you activate a pivot chart, the PivotTable Tools “super tab” changes to PivotChart Tools. This super tab includes four ribbons for manipulating pivot charts: Design, Layout, Format, and Analyze (see Figure 3.46). There is not enough space here to discuss the many options on these ribbons, but they are intuitive and easy to use. As usual, don’t be afraid to experiment.

Multiple Variables in the Data Area

More than a single variable can be placed in the Values area. In addition, a given variable in the Values area can be summarized by more than one summarizing function. This can make for a rather busy pivot table, so we indicate our favorite way of doing it. Starting with the pivot table in Figure 3.43, drag the TotalCost item in the top of the Field List window (the item that is already checked) to the Values area. The bottom half of the Field List window should now appear as in Figure 3.47, and the pivot table should now appear as in Figure 3.48. Note in particular the Values button in the Column Labels area. This button controls the placement of the data in the pivot table. You have a number of options for this button: (1) leave it where it is, (2) drag it above the Time button, (3) drag it to the Row Labels area, below the Region button, or (4) drag it to the Row Labels area, above the

3.5 Pivot Tables 125

Figure 3.46 PivotChart Tools Ribbons

Figure 3.47 Field List Window with Two Values Fields

Region button. You can experiment with these options, but we tend to prefer option (2), which leads to the pivot table in Figure 3.49.

In a similar manner, you can experiment with the buttons in the Values area. However, the effect here is less striking. If you drag the Sum of TotalCost button above the Average of TotalCost button in the field list, the effect is simply to switch the ordering of these sum- maries in the pivot table, as shown in Figure 3.50.

Figure 3.49 Rearranged Pivot Table with Two Values Fields

Figure 3.50 Another Rearrangement of the Pivot Table with Two Values Fields

Summarizing by Count

The variable in the Values area, whatever it is, can be summarized by the Count function. This is useful when you want to know, for example, how many of the orders were placed by females in the South. When summarizing by Count, the key is to understand that the actual variable placed in the Values area is irrelevant, so long as you summarize it by the Count function. To illustrate, start with the pivot table in Figure 3.50, where TotalCost is summarized with the Sum function. Now right-click on any number in the pivot table, select Value Field Settings, and select the Count function (see Figure 3.51). The default Custom Name you will see in this dialog box, Count of TotalCost, is misleading, because TotalCost has nothing to do with the counts obtained. Therefore, we like to change this Custom Name label to Count, as shown in the figure. The resulting pivot table, with values formatted as numbers with zero decimals, appears in Figure 3.52. For example, 27 of the 400 orders were placed in the morning in the South, and 115 of the 400 orders were placed in the NorthEast. (Do you now see why the counts have nothing to do with TotalCost?) This type of pivot table, with counts for various categories, is the same

crosstabs that we discussed in Section 3.2. However, it has now been created much more

When data are summarized by counts, there are a number of ways they can be displayed. The pivot table in Figure 3.52 shows “raw counts.” Depending on the type of information you want, it might be more useful to display the counts as percentages. Three particular options are typically chosen: as percentages of total, as percentages of row, and as percentages of column. When shown as percentages of total, the percentages in the table sum to 100%; when shown as percentages of row, the percentages in each row sum to 100%; when shown as percentages of column, the percentages in each column sum to 100%. Each of these options can be useful, depending on the question you are trying to answer. For example, if you want to know whether the daily pattern of orders varies from region to region, showing the counts as percentages of column is useful so that you can compare columns. But if you want to see whether the regional ordering pattern varies by time of day, showing the counts as percentages of row is useful so that you can compare rows.

3.5 Pivot Tables 127

Figure 3.51 Field Settings Dialog Box with Count Selected

Figure 3.52 Pivot Table with Counts.

To display the counts as percentages of some type, display the Value Field Settings dialog box (remember how?), select the Show Values As tab, and select the appropriate option (see Figure 3.53). The resulting pivot table and corresponding pivot chart appear in Figure 3.54. As you can see, the pattern of regional orders varies somewhat by time of day.

Counts can be displayed in a number of ways.You should choose the way that best answers the question you are asking.

Sometimes it is useful to see the raw counts and the percentages. This can be done easily by dragging any variable to the Data area, summarizing it by Count, and displaying it as “Normal.” Figure 3.55 shows one possibility, where we have changed the custom names of the two Count variables to make them more meaningful. Alternatively, the counts and percentages could be shown in two separate pivot tables.

Figure 3.53 Value Field Settings Dialog Box with “Show Values As” Options

Figure 3.54 Pivot Table and Pivot Chart with Counts As Percentages of Rows

Grouping

Finally, categories in a Row or Column variable can be grouped. This is especially useful when a Row or Column variable has many distinct values. Because a pivot table creates a row or column for each distinct value, the results can be unwieldy. We present two possibilities. First, suppose you want to break Sum of TotalCost down by Date. Starting with a blank pivot table, check both Date and TotalCost in the pivot table field list window. This creates a separate row for each distinct date in the data set—112 separate dates. This is too much detail, so it is useful to group on the Date variable. To do so, right-click on any date in column A and select the Group item. (Group options are also available on the Options ribbon.) Accept the default selections in the Grouping dialog box (see Figure 3.56) to obtain the pivot table in Figure 3.57.

3.5 Pivot Tables 129

Figure 3.56 Grouping Dialog Box

Figure 3.57 Pivot Table after Grouping by Month

Pivot Table Tip Suppose you have multiple years of data and you would like a monthly grouping such as January 2007 through December 2009. If you simply select Months as in Figure 3.56, all of the Januaries, for example, will be lumped together. The trick is to select both Months and Years in the dialog box.

As a second possibility for grouping, suppose you want to see how the average of TotalCost varies by the amount of the highest priced item in the order. Place TotalCost in the Data area, summarized by Average, and place HighItem in the Row area. Because HighItem has nearly 400 distinct values, the resulting pivot table is virtually worthless. Again, however, the trick is to group on the Row variable. This time there are no natural groupings as there are

for a date variable, so it is up to you to create the groupings. Excel provides a suggestion, as shown in Figure 3.58, but you can override it. For example, changing the bottom entry to 50 leads to the pivot table in Figure 3.59. Some experimentation is typically required to obtain the grouping that presents the results in the most appropriate way.

Figure 3.58 Grouping Dialog Box for a Non-Date Variable