CHAPTER 2
For the most part we will assume that the reader is already familiar with Excel. We don’t have space here to provide a review of Excel. Instead we will touch on a few of the capabilities which are not as commonly used, but are essential for doing statistical analysis in Excel. We will focus on the most recent versions of Excel in the Windows environment, namely Excel 2007, 2010 and 2013, although most of what is written will also apply to earlier versions of Excel in Windows as well as Excel for the Macintosh (especially Excel 2011).
Excel works with files called workbooks. Each workbook contains one or more spreadsheets, called worksheets. Each worksheet consists of cells organized in a rectangular grid. The rows of a worksheet are labeled with a number and the columns are labeled with a letter or series of letters. The first row is labeled 1, the next 2 and so on. The first column is labeled A, the next B, etc. The column after Z is labeled AA, and then AB, AC, and so on. The column after AZ is BA, and then BB, BC, etc. The column after ZZ is AAA, and then AAB, AAC, etc.
Each worksheet in Excel 2007/2010/2013 can contain up to 1,048,578 rows and 16,384 columns (i.e. column A through XFD). Each cell can contain a number, text, truth value or a formula. Cells have both a value and an address. E.g., the cell with address B5 lies in the fifth row and second column.
A cell range consists of all the cells in a rectangular area on the worksheet. E.g., the range B6:C10 consists of the 10 cells in the rectangle whose opposite corners are the cells B6 and C10. The formula =SUM(B6:C10) has a value which is the sum of values of all the cells in the range B6:C10). Formulas can use either absolute addressing, relative addressing or both.
The rectangular grid is only one part of the Excel user interface. The entire interface is as depicted in Figure 1.

This is the layout used in Excel 2007. The layout in Excel 2010 and Excel 2013 is almost identical.
The Ribbon Tabs are the top level menu items. In Figure 1 these are Home, Insert, Page Layout, Formulas, etc. To access most capabilities in Excel you click on one of these ribbon tabs. For each tab a different ribbon will be displayed.
Each ribbon consists of a collection of Excel capabilities organized into groups corresponding to some ribbon tab. For example, the Home ribbon, displayed in Figure1, is organized into the Clipboard, Font, Alignment, Number, etc. groups. Each group consists of one or more icons corresponding to some capabilities in Excel. For example, to center the content of a cell in a worksheet, click on that cell and then click on the center icon
in the Alignment group on the Home ribbon.
We use the following abbreviation for this sequence of steps: Home > Alignment|Center. There are also shortcuts for some icons. E.g., to center the contents of a cell, you can click on that cell and then enter Ctrl-E.
Most worksheet formulas used in Excel return a single value that is assigned to the cell that contains the formula. Excel also allows you to define a formula that assigns values to a range of cells at the same time. These are called array formulas and they will be used quite often in the rest of this book.
Example: Calculate the revenues for each item in the worksheet of Figure 2.

Given that the revenue for each item is unit price times quantity, we can enter the formula =B4*C4 in cell D4 and then copy this formula into cells D5, D6 and D7 (e.g., by clicking on cell D4, pressing Ctrl-C, then highlighting the range D5:D7 and pressing Ctrl-V, or by highlighting the range D4:D7 and pressing Ctrl-D).
Another way to do this is via an array formula, using the following steps:
It is essential to press Ctrl-Shft-Enter (step 3) and not simply Enter as for an ordinary formula. Note that the formula that appears in the formula bar is {=B4:B7*C4:C7}. The curly brackets indicate that this is an array formula. If the range B4:B7 is given the name UnitPrice and C4:C7 is given the name Quantity, then the array formula can be entered as =UnitPrice*Quantity (step 2).
The array formula appears in all four cells in the range D4:D7. To make changes to the formula you must edit the entire range and not just one, two or three of these cells. Similarly you can’t copy or delete a part of the range, but must copy or delete the entire range. If you attempt to modify a part of the range you will receive an error message. If you get stuck and see a series of error messages you just need to press the escape key Esc to recover.
You can erase a range that contains an array formula by highlighting the entire range and pressing the Delete key. You can write over the array function replacing it by a value or another formula. The important thing is to use the entire range and not a part of the range.
Note too that you can also use array formulas such as {=SUM(B4:B7*C4:C7)}. This returns the value which is the sum of the revenues of the four types of equipment. Even though this formula returns a single value, and so may be placed in a single cell such as D8, it must be entered as an array formula (since the formula contains an embedded array formula). This means that you need to type =SUM(B4:B7*C4:C7) and then press Ctrl-Shft-Enter. If you forget to press Ctrl-Shft-Enter and only press Enter, you will get an error message.
A few of Excel’s built-in functions are array functions, where the output of the function is an array (i.e. a range). These functions are managed as described previously for array formulas.
Example: Change the data range in columns A and B of Figure 3 into an equivalent row range.

This can be accomplished by means of Excel’s TRANSPOSE array function using the following steps:
Note that the output range (step 1) must be of the right size. In this case, the input range is 6 rows by 2 columns, and so the output range must be 2 rows by 6 columns. As for array formulas, the formula bar contains the array formula enclosed in curly brackets. Once again it is important to press Ctrl-Shft-Enter.
Excel provides a number of data analysis tools, which are accessible via Data > Analysis|Data Analysis.
If this option is not visible you may need to first install Excel’s analysis tool pack. This is done by selecting Office Button > Excel Options > Add-Ins in Excel 2007 or File > Help|Options > Add-Ins in Excel 2010/2013, and clicking the Go button at the bottom of the window. Next you select the Analysis ToolPak option on the dialog box that appears and click the OK button. You will then be able to access the data analysis tools.
After selecting Data > Analysis|Data Analysis you will be presented with the dialog box in Figure 4.

You can now select any one of the following options useful for statistical analysis:
Each of these options represents a data analysis tool that will be described in this book. Now suppose by way of example that you choose Anova: Single Factor. You will now be presented with the dialog box shown in Figure 5.

The Input Range consists of the Excel range where the data elements to be analyzed are stored. Suppose, by way of example, this data consists of a 4 × 8 array representing 4 treatments as in Figure 6.

In this case you insert the range B2:E9 in the Input Range field (of the dialog box of Figure 5) and select the Columns radio button. If you had assigned a name (e.g., Study1) to the range B2:E9 then you could put this name in the Input Range field instead of B2:E9.
Alternatively you could insert B1:E9 in the Input Range field and check the Labels in First Row checkbox in the dialog box to indicate that you have included the column headings in the data range. Note that the participant numbers (in column A) are not used.
If the data were arranged where the treatments are listed by row instead of column, then you would select the Rows radio button and you could optionally select the Labels in First Column checkbox.
The Alpha value (which will be described later in the book) is set to 0.05 by default, although you can optionally change this to 0.01 or some other value.
You can now choose the New Worksheet Ply radio button from the Output options (and leave the data field blank). In this case a new worksheet will be created (in the tab prior to the current one) and the ANOVA report will be placed in this worksheet starting at cell A1. You can then copy the results to the current worksheet (or anywhere else you like).
Alternatively, you can choose the Output Range or New Workbook radio button, to put the report in some specific output range that you choose or in a new workbook.