SWK 3401.4-5.1 One-Sample t-Test Exercise
INTRODUCTION
The purpose of this activity is to learn how to compute and evaluate thet-statistic and p-value associated with a one-sample (two-tailed) t-test. The example comes from the hypothetical lead levels data where the difference between lead values collected at time 1 and values collected at time 2 have been calculated as a new variable: t1t2diff. (This was computed by entering the syntax =C2-G2 into the cell I2. Then, the cell was copied and pasted into the rest of the cells in the column, where Excel was smart enough to compute the difference for each row.)
INSTRUCTIONS
- Open the file called lead levels data time 1 time 2 zero diff one sample t test start.xlsx. You will need to select “edit” mode to work with the file.
- The investigators wish to determine if the (hypothetical) children’s lead levels are rising, staying the same, or dropping. This means that they have a two-tailed hypothesis to test, and the null hypothesis might be:
H0: the mean difference in time 1 and time 2 lead levels is zero. - In order to perform this analysis, we need to know several pieces of information. We need to know the sample size (N), the sample mean (X̅), and the sample standard deviation (sd). From this, we can compute the degrees of freedom (N-1) and the square root of the sample size (√N). Plugging these values into the formula, we can determine the t-value and the p-value, then compare these to the critical value from a t-distribution table. Let’s see where these come from.
- Use the “count” command to determine the sample size. Click on an empty cell, perhaps L2. Enter in the syntax for counting the total number of children in the sample:
=Count(A2:A179)
The result in that cell should be 178. Now, because there are multiple steps involved, we want to start labeling what these computed values might be. So, in the cell to the left of the one with our count (K2), make a notation of what L2 is—perhaps use N to designate sample size. (N=178) - Next, click on an empty cell, perhaps L3. Enter the syntax for computing the mean of the difference values in the column called “I”. That syntax uses the word AVERAGE instead of MEAN, and this is what you would type into cell L3 or use the drop-down menu next to the ∑ in the top right menu bar:
=AVERAGE(I2:I179)
Then, in the cell to the left of the computation, type in a label—perhaps use Mean. The value for the mean should be: M= -0.010 in this sample (which is the same as -0.010). - Excel has a command for computing the standard deviation of a sample: STDEV.S (the one called STDEV.P is for a population). The syntax is similar to what we have been doing, with the range of cells following the command. In an empty cell, such as L4, type in the syntax request:
=STDEV.S(I2:I179)
Then, label it in the preceding empty, perhaps putting into cell K4 the nickname sd. The value that should show up in your computation cell (L4) should be 0.028. - The remaining information we need is that the degrees of freedom (df) is the sample size minus 1 (N-1), so we have 178-1 degrees of freedom, or df=177. We can make Excel do this computation by clicking on an empty cell (L5) and entering the command syntax: =L2-1 and labeling it df using cell K5.
- The standard value that we are comparing our mean to is 0 (no difference value). We can call this “delta” since the symbol Δ is difficult to type unless you know how to insert symbols.
- The formula we are going to use to compute the t-value remains what we learned in the coursebook:
t(df)= (X̅ – Δ)/[(s) * 1/√n)] - Now we can get Excel working on calculating the numerator of this equation: (X̅ – Δ), which is the syntax (try putting this in cell L7): =L3 – 0 or, we could type in =L3-L6 to get the numerator. Which, in this case, is just the mean since we subtracted zero.
- To calculate the denominator, we are going to have to know the square root of the sample size (√N). Excel can do this calculation for us!!! The syntax (try this in cell L8) is: =SQRT(L2) to get the square root of our sample size (N=178) that was entered into cell L2. The answer: 13.34 (with rounding). We need to divide this value into 1 (we needed 1/√N in our formula to multiply by the sd). So, we point Excel in this direction by clicking on an empty cell (try L10) and type in the syntax for the division command: =1/L8 to get the answer 0.075 (with rounding), and we label it using cell K10.
- This makes the denominator for computing our t-statistic value (sd)*(1/sqrtN), which again we can make Excel compute for us. The syntax that we can enter into cell L11 is: =L4*L10 to get the result that our denominator is 0.0021 (with rounding).
- Now we are ready to do the final computation, dividing the computed numerator by the computed denominator. Ask Excel (in cell L12) to divide the numerator by the denominator: =L7/L11 to get the resulting t-value of -4.59 (with rounding). This would be written as: t(177)= -4.59.
- To calculate the p-value for this statistic using Excel we need two pieces of information: the absolute value of the t-statistic and degrees of freedom. The absolute value, as you may recall, means the distance from zero, whether it is above or below zero (positive or negative number). In our case, the absolute value of our t-statistic is 4.59 and the Excel syntax for finding this is: ABS(L12) since our t-value was computed in cell L12. Placing this syntax into cell L13 should result in the value 4.59 (with rounding).
- The syntax command for Excel to compute the p-value for this absolute value of the t-statistic at the study’s degrees of freedom is as follows (placing it in cell L15): =T.DIST.2T(L13,L5) which is asking Excel for the 2-tailed t-distribution probability for the absolute value of the t-statistic we computed (L13=4.59) at the computed degrees of freedom (L5=177).Placing this syntax into cell L15 should result in a p-value of 8.35528E-06.
- What in the world could that mean? A number with the letter E embedded in it? This is referring to scientific notation, where the base value (8.355) is multiplied by 10 to the exponent value (-6). Because the exponent is negative, this is the same as putting 5 zeros before the digits and after the decimal place. In other words, it is a really, really, really tiny value! (if the exponent were positive, it would mean putting zeros after the digts and before the decimal, making it a really large number.) In APA format, we would write this as p<.001 instead of its actual value: p=.00000835528.
- To finish our exercise, let’s look back at the original null hypothesis:
H0: the mean of the difference values from time 1 to time 2 is equal to zero.
Since the computed p-value is less than our criterion of α=.05, we reject the null hypothesis and conclude that the mean of the difference values is unequal to zero. - Additionally, we can consult the t-distribution table for df=177 at 95% confidence as a two-tailed test. But, no one carries these tables around with them all the time. Again, we can make Excel do the work for us. The syntax for finding the critical value for comparing with our computed t-value is: =T.INV.2T(probability, deg. Freedom). The probability is our α of .05 (remember α=1-p, since α and probability are the inverse of each other). So, in our example, the syntax entered into a new empty cell (try cell L18) is: =T.INV.2T(.05,L5) where we want the inverse for t-values on a two-tailed distribution at α=.05 for the degrees of freedom in cell L5 (df=177). Excel tells us that the critical criterion value for comparing our computed t-statistic is 1.97 (with rounding). Since the absolute value of our t-value (4.59) is greater than our critical value of 1.97, the decision is to reject the null hypothesis. Not surprising, since we always reach the same conclusion whether we use the p<.05 or t>criterion decision rule—they are directly related.
- Feel free to compare your work with the file called lead-levels-data-time-1-time-2-zero-diff-one-sample-t-test-finish revised.xlsx. If you obtained these same results in your work with Excel, and drew the same conclusions, you have successfully completed this very challenging learning activity!