"

SWK 3401.4-5.2 Independent Samples t-Test Exercise

INTRODUCTION

The purpose of this activity is to learn how to compute and evaluate thet-statistic and p-value associated with an independent-samples (two-tailed)t-test using Excel. The example comes from the readiness to change data of the Safe At home project. The research question concerns the probability of an association between the categorical (dichotomous) variable about clients entering batterer treatment voluntarily or non-voluntarily (court ordered to treatment) and the interval variable of the number of weeks since the intimate partner violence incident that led to program admission. Are the mean number of weeks for these two groups different? The null hypothesis becomes:

H0: The difference in mean number of weeks since the incident between clients entering batterer treatment voluntarily and those entering under court order is zero.

INSTRUCTIONS

  1. Open the data file called battertx indep samples t test start.xlsx. You will need to select enable editing to work with this file.
  2. Given the large number of steps we had to take in computing the one-sample t-test, you might expect this test to be even more complicated. Fortunately, there is only one main task involved in the calculation process! Unfortunately, our data file is not configured the way that Excel needs it configured to execute this analysis. We need to do a little reconfiguration work on the data file up front.
  3. The reconfiguration goal is to have the number of weeks (sincwk is the variable name) in one cluster for the clients who entered nonvoluntarily/through court order (the variable called entry is coded as 1 for this group) and in a separate cluster for clients who entered voluntarily (entry variable is coded as 2 for that group). This is most easily accomplished by first sorting the cases on the entry variable so that the “1s” and “2s” are separated into two clusters. This process begins with selecting all of the data in the file (select all in PCs is “Control A” or you can simply click and drag to include all of the data). Otherwise, the sort only happens in the one column and the other data do not rearrange with it—this is a major flaw in Excel to be aware of!
  4. Then, in the top menu bar, select the “data” tab. In the next level menu, select the “Sort” tab. This should open a dialogue box where you designate sorting by the “grouping” variable in our analysis—entry shows up when you click the down arrow next to “sort by” in the box. We want the data sorted on cell values (the entry variable 1s and 2s) and the order does not really matter, smallest to largest is fine. When you click on “ok” you will see that your data have been reorganized for you. The first id number is 20237 and entry value is 1 for that case and all the cases you can see on your screen; before you did this sort step, the first id number was 20236 and the entry value was 2 for that case. If you scroll down through the data, you will see that line #309 is where the split happens between the two groups (cells F2 through F308 are the group of 1s on the entry variable, cells F309 through F370 are the group of 2s on the entry variable).
  5. Now, you need to select and copy all the values in the “sincwk” column where the “entry” value is a 1. This is rows G2:G308. Then paste it into an empty column—P would be a good choice, starting with P2. Then, place a label at the top of this new column—nonvol is a decent choice for cell P1.
  6. Repeat this process with all the “sincwk” values from G309:G370 to pick up values where the “entry” value is 2. Paste it into another empty column (Q is a good choice) starting with row 2 (Q2) and label this column something like vol in cell Q1. Completing these steps means you have now created 2 new data columns by copying/pasting the data of interest for one variable (sincwk) into two new “variables” using Excel logic—these are not really “variables” the way we have learned about them, they are two categories of data on one dichotomous variable. But this is the logic/language that Excel has adopted, so we’ll have to live with it and not let it confuse us.
  7. Now that the data are properly configured, the next step is to click on “data” tab in the menu bar and double-click on “data analysis” in the upper right corner of the top menu bar. This opens the menu of statistical analyses available—the event you have been looking forward to completing!
  8. Scroll down the list and select the one called t-test: two-sample assuming unequal variances. When you click “ok” you see a dialogue box with information to be filled in about your analysis. The top two blanks are for designating where your input data are located—for “variable 1” (the sincwk values for the nonvoluntary group) and variable 2 (the sincwk values for the voluntary group). But, since we want to have the column labels included (see selecting “labels” in step 7 below), we actually use the range for the first group (called “variable 1” in Excel) as P1:P308 and the range for the second group (called “variable 2” in Excel) as Q1:Q63. The actual syntax for these two “variables” is:

P1:P308 and Q1:Q63  or it can be $P$1:$P$308 and  $Q$1:$Q$63   in the two blanks.

  1. Filling in the remaining information for the dialogue box, we set the hypothesized mean difference to match our null hypothesis: 0. And we want to select labels (what we put into cells P1 and Q1). The alpha default in Excel is what we would have chosen: α=.05 and we want it to present the results in a section of our worksheet—have Output Range checked  and maybe have it start in cell R2. When you click on “ok” results of the analysis should appear in your open worksheet.
  2. Let’s assess the results of this independent samples t-test. The mean number of weeks for the nonvoluntary entry group is higher than the mean for the voluntary entry group (M=40.54 and 36.52 with rounding). This difference may or may not be statistically significant—we fail to reject the null hypothesis of no difference because the t-statistic is less than the critical value for a two-tailed test: t(76)=-0.41 (with rounding) and the two-tailed critical value from the t-distribution=1.99. If you prefer, we fail to reject because the p-value of 0.68 was greater than our α=.05 criterion.
  3. If you obtained these results (feel free to compare your output to the output in the file named battertx indep samples t test finish.xlsx), and drew the same conclusions, you have successfully completed this analysis activity! 

License

Icon for the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License

Social Work 3401/3402 Workbook Copyright © by Dr. Audrey Begun is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.