Now let's evaluate third variables as potential moderators in the context of the chi-square test of independence. For this, we're going to return to our original SAS program using the NESARC data and asking the question, is smoking associated with nicotine dependence? We're going to create another smoking variable for this purpose, reflecting how many cigarettes each young adult smoker smokes per day. 0 indicates non-daily smokers, 3 indicates those smoking 1 to 5 cigarettes per day, 8 indicates 6 to 10 cigarettes per day, 13 indicates 11 to 15 cigarettes per day, 18 indicates 16 to 20 cigarettes per day, and 37 indicates more than 20 cigarettes per day.
Now we're going to scroll down to the bottom of the program and request a chi-square test of independence, examining the association between nicotine dependence in the past 12 months and this new smoking variable, called USQUAN. As we can see from the large chi-square value and significant p-value, smoking quantity and nicotine dependence are significantly associated. Examining the column percents, which show the share of each smoking group with nicotine dependence, we see generally higher rates of nicotine dependence among groups that smoke more. Among non-daily smokers, only 25% are nicotine dependent. Among those smoking 1 to 5 cigarettes per day, indicated by a code of 3, nearly 50% meet criteria for nicotine dependence. Those percentages increase with higher levels of smoking.
To graph these proportions, we use the following code: PROC GCHART; VBAR USQUAN/discrete type=mean SUMVAR=TAB12MDX;. We ask for the mean, which gives us the proportion of individuals with nicotine dependence, because this variable is coded 0 and 1. First, let's quickly review what this code is telling SAS. Just like our PROC GCHART code in univariate graphing, VBAR requests a vertical bar chart.
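Laid out as a program, the steps described so far might look like the sketch below. This is only an illustration, not the lecture's actual program: the data set names NESARC and NEW and the input variables DAILY (1 = daily smoker, 0 = non-daily) and CIGSDAY (cigarettes per day) are hypothetical placeholders. Only USQUAN, TAB12MDX, and the PROC GCHART statement come from the lecture.

```sas
/* Sketch only. NESARC, NEW, DAILY, and CIGSDAY are hypothetical names. */
DATA new;
  SET nesarc;
  /* Code each quantity band with the value described in the lecture */
  IF DAILY = 0 THEN USQUAN = 0;             /* non-daily smokers      */
  ELSE IF CIGSDAY = . THEN USQUAN = .;      /* keep missing as missing */
  ELSE IF CIGSDAY LE 5 THEN USQUAN = 3;     /* 1 to 5 per day         */
  ELSE IF CIGSDAY LE 10 THEN USQUAN = 8;    /* 6 to 10 per day        */
  ELSE IF CIGSDAY LE 15 THEN USQUAN = 13;   /* 11 to 15 per day       */
  ELSE IF CIGSDAY LE 20 THEN USQUAN = 18;   /* 16 to 20 per day       */
  ELSE USQUAN = 37;                         /* more than 20 per day   */
RUN;

/* Chi-square test of independence: response in rows, smoking group in columns */
PROC FREQ DATA=new;
  TABLES TAB12MDX*USQUAN / CHISQ;
RUN;

/* Bar chart of the proportion with nicotine dependence in each smoking group */
PROC GCHART DATA=new;
  VBAR USQUAN / DISCRETE TYPE=MEAN SUMVAR=TAB12MDX;
RUN;
QUIT;
```

The missing-value check comes before the comparisons because SAS treats a missing numeric value as smaller than any number, so without it a missing CIGSDAY would fall into the first band.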
Following VBAR with the categorical explanatory variable and the discrete option, USQUAN/discrete, tells SAS that we want the levels of our categorical explanatory variable to be represented on the x-axis. The rest of the code in this statement provides instructions to SAS for how the response variable should be represented on the y-axis. Specifically, type=mean requests a calculated average, and SUMVAR is short for summary variable. So we're asking for the response variable TAB12MDX to be displayed as a mean on the y-axis. And this gives us the graphic representation of this positive linear relationship: as smoking quantity increases, so does the proportion of individuals with nicotine dependence. This finding holds for the larger population of young adult smokers.
But might a third variable moderate the relationship between smoking quantity and nicotine dependence? Put another way, might there be a statistical interaction between a third variable and smoking behavior in predicting our response variable, nicotine dependence? We're going to evaluate major depressive disorder as the third variable. Our question will be, does major depression affect either the strength or the direction of the relationship between smoking and nicotine dependence? That is, is smoking related to nicotine dependence at each level of this third variable, for those with major depression and for those without?
Similar to our ANOVA example, the syntax to be added to the PROC FREQ code is circled here in red. We first need to sort the data according to the categorical third variable, then include a BY statement telling SAS to run a chi-square test for each level of the third variable separately. The specific syntax for this example is shown here: PROC SORT; BY MAJORDEPLIFE; PROC FREQ; TABLES TAB12MDX*USQUAN/CHISQ; BY MAJORDEPLIFE;. When this syntax is added to the SAS program, here are the results.
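Written out one statement per line, the moderation syntax above might look like the following sketch. The data set name NEW is an assumption standing in for whatever working data set the program uses, and the RUN statements are added for clarity. The procedures, variables, and options are the ones named in the lecture.

```sas
/* Sketch only. NEW is a hypothetical name for the working data set. */

/* BY-group processing expects the data to be sorted by the BY variable */
PROC SORT DATA=new;
  BY MAJORDEPLIFE;
RUN;

/* One cross-tabulation and chi-square test per level of MAJORDEPLIFE */
PROC FREQ DATA=new;
  TABLES TAB12MDX*USQUAN / CHISQ;
  BY MAJORDEPLIFE;
RUN;
```

If the sort step is skipped and the data are not already ordered by MAJORDEPLIFE, SAS stops with an error saying the data set is not sorted, which is why PROC SORT comes first.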
You can see the cross-tabulation table, looking at usual quantity by tobacco dependence in the past 12 months. First, for major depression equal to 0, that is, those without major depression, the chi-square value is large and the p-value is quite small. In addition, the column percents reveal what seems to be a positive linear relationship, with the percentage of nicotine dependence increasing from lower to higher levels of smoking. So we can say that this is a statistically significant relationship for those without major depression.
For those with major depression, we also find a large chi-square value and a small p-value, so this relationship is statistically significant as well. These column percents also reveal what seems to be a positive linear relationship, with the percentage of nicotine dependence increasing from lower to higher levels of smoking.
Using a line graph to examine the rates of nicotine dependence at different levels of smoking, it seems that both the direction and the size of the relationship between smoking and nicotine dependence are similar for those with major depression and for those without, although those with major depression show higher rates of nicotine dependence at every level of smoking quantity. In this case, we would say that a diagnosis of major depression does not moderate the relationship between smoking and nicotine dependence. For both young adult smokers with major depression and those without, higher levels of smoking are associated with higher rates of nicotine dependence.
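The lecture does not show how the line graph was produced. One possible way to draw it, offered only as a sketch, is PROC SGPLOT with a VLINE statement. The choice of procedure, the data set name NEW, and the axis labels are all assumptions here. The variables are the ones used in the analysis.

```sas
/* Sketch only. PROC SGPLOT and the data set name NEW are assumptions;
   the lecture does not say how its line graph was made. */
PROC SGPLOT DATA=new;
  /* One line per depression group; each point is the mean of the 0/1
     response, i.e. the proportion with nicotine dependence */
  VLINE USQUAN / RESPONSE=TAB12MDX STAT=MEAN GROUP=MAJORDEPLIFE MARKERS;
  XAXIS LABEL="Usual smoking quantity (USQUAN code)";
  YAXIS LABEL="Proportion with nicotine dependence";
RUN;
```

Two lines that rise at a similar rate, with one sitting above the other, match the pattern described above: a similar relationship in both groups, with higher dependence rates among those with major depression.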