load('data/nhanes500.RData')
load('data/vitc.RData')
load('data/asthma.RData')
Lab: Two-sample inference for proportions
With solutions
The goal of this lab is to learn how to perform two-sample inference for categorical data. This includes both binomial and multinomial data. Examples follow those given in lecture.
Two-way tables
For study data in which a values of categorical variable are recorded for two or more groups, analysis proceeds from a two-way table of the category counts for each group. Two-way tables are constructed by cross-tabulating the categorical variable with the group labels. In R, this is achieved using the table(...)
function with two arguments. For the data from the Vitamin C experiment discussed in lecture:
# two-way table
table(vitc$trt, vitc$out)
Cold NoCold
Placebo 335 76
VitC 302 105
By convention, the categories for the variable of interest are shown on the column dimension and the group labels are shown in the row dimension. In R, this means entering the grouping variable (vitc$trt
above) as the first argument to table(...)
.
To see groupwise point estimates of the category proportions, use prop.table(...)
:
# groupwise proportions
table(vitc$trt, vitc$out) |>
prop.table(margin = 1)
Cold NoCold
Placebo 0.8150852 0.1849148
VitC 0.7420147 0.2579853
The estimated chance of getting a cold without the supplement is 81.5%, compared with an estimated chance of 74.2% with the supplement.
Notice the margin = 1
argument – this specifies that the counts should be normalized using the row totals (as opposed to the column totals or grand total). It’s very important to specify this correctly – you should do a quick sanity check by hand to make sure the proportions returned match your expectation.
The asthma
data contains, for a random sample of adults, each subjects’ sex and whether they are asthmatic. Construct a two-way table of asthma prevalence by sex and compute the proportion of asthmatics for each group.
# two-way table
table(asthma$sex, asthma$asthma)
asthma no asthma
male 30 769
female 49 781
# groupwise proportions
table(asthma$sex, asthma$asthma) |>
prop.table(margin = 1)
asthma no asthma
male 0.03754693 0.96245307
female 0.05903614 0.94096386
30 men and 49 women in the dataset are asthmatic; this amounts to an estimated prevalence of 3.75% and 5.90% for men and women, respectively.
If you obtained different proportions, then either the margin is not specified correctly or the table orientation is reversed.
The construction of two-way tables proceeds identically for multinomial data and/or multiple groups. For the example comparing health perceptions between survey years from the NHANES data:
# two-way table for multinomial data
table(nhanes$surveyyr, nhanes$healthgen)
Excellent Vgood Good Fair Poor
2009_10 27 78 88 28 5
2011_12 20 84 89 25 6
# groupwise proportions
table(nhanes$surveyyr, nhanes$healthgen) |>
prop.table(margin = 1)
Excellent Vgood Good Fair Poor
2009_10 0.11946903 0.34513274 0.38938053 0.12389381 0.02212389
2011_12 0.08928571 0.37500000 0.39732143 0.11160714 0.02678571
An estimated 11.95% of U.S. adults perceive themselves in excellent health in 2009-2010; by comparison, an estimated 8.93% hold the same perception in 2011-2012.
Using the nhanes
data, estimate the proportions of men and women who perceive themselves as in excellent health.
# groupwise proportions
table(nhanes$gender, nhanes$healthgen) |>
prop.table(margin = 1)
Excellent Vgood Good Fair Poor
female 0.07327586 0.36206897 0.39655172 0.13793103 0.03017241
male 0.13761468 0.35779817 0.38990826 0.09633028 0.01834862
An estimated 7.33% of women perceive themselves in excellent health; by comparison, an estimated 13.76% of men hold the same perception.
When the grouping attribute is constrained by the study design (as in the Vitamin C experiment or when grouping NHANES data by survey year), it’s more natural to think of the table as a summary of univariate data for two or more independent samples. In this context, inferences should be described in terms of comparisons among groups. However, when the grouping attribute is determined by observational data (as in the asthma data or when grouping NHANES data by sex), it’s more natural to think of the two-way table as a multivariate summary of a single sample. In this context, inferences should be described in terms of “association” between the variables that define the columns and rows.
Two-sample inference for binomial data
Consider testing whether vitamin C supplements affect the probability of getting a cold. The corresponding hypotheses are:
\[ \begin{cases} H_0: &p_1 = p_2 \\ H_A: &p_1 \neq p_2 \end{cases} \] The subscripts distinguish the treatment groups (not the outcomes). As in the one-sample setting, a test can be constructed by:
- determining expected counts if \(p_1 = p_2 = p\) from the table margins (see here)
- measuring the divergence of observations from expectations using the \(\chi^2\) statistic
- comparing with a \(\chi^2_1\) model to determine a p-value
Implementation in R
is provided by prop.test(...)
:
# inference for a difference in binomial proportions
table(vitc$trt, vitc$out) |>
prop.test()
2-sample test for equality of proportions with continuity correction
data: table(vitc$trt, vitc$out)
X-squared = 5.9196, df = 1, p-value = 0.01497
alternative hypothesis: two.sided
95 percent confidence interval:
0.01391972 0.13222111
sample estimates:
prop 1 prop 2
0.8150852 0.7420147
Since the data come from a randomized experiment, we can interpret the difference in proportions as a causal effect of the treatment:
The data provide evidence that vitamin C supplements affect the probability of getting a common cold (\(\chi^2\) = 5.92 on 1 degree of freedom, p = 0.01497).
The function also provides an interval estimate for the difference in proportions:
With 95% confidence, vitamin C supplements lower the chance of getting a common cold by an estimated 1.39 to 13.22 percentage points.
Test whether asthma prevalence differs between men and women. Report the result of the test in context following conventional style and provide an interval estimate for the difference.
# inference for a difference in binomial proportions
table(asthma$sex, asthma$asthma) |>
prop.test()
2-sample test for equality of proportions with continuity correction
data: table(asthma$sex, asthma$asthma)
X-squared = 3.6217, df = 1, p-value = 0.05703
alternative hypothesis: two.sided
95 percent confidence interval:
-0.0434742223 0.0004958005
sample estimates:
prop 1 prop 2
0.03754693 0.05903614
The data do not provide evidence that asthma prevalence differs between men and women (\(\chi^2\) = 3.62 on 1 degree of freedom, p = 0.05703).
A 95% interval is returned by default. To change this, add a conf.level = ...
argument:
# adjust confidence level
table(vitc$trt, vitc$out) |>
prop.test(conf.level = 0.99)
2-sample test for equality of proportions with continuity correction
data: table(vitc$trt, vitc$out)
X-squared = 5.9196, df = 1, p-value = 0.01497
alternative hypothesis: two.sided
99 percent confidence interval:
-0.003898486 0.150039319
sample estimates:
prop 1 prop 2
0.8150852 0.7420147
One can also perform a directional test by adding an alternative = ...
argument. For instance, to test whether vitamin C prevents common cold, we’d test:
\[
\begin{cases}
H_0: &p_1 = p_2 \\
H_A: &p_1 > p_2
\end{cases}
\] Here \(p_1\) denotes the binomial proportion for the placebo group and \(p_2\) denotes the same for the treatment group. To perform this test, add alternative = 'greater'
:
# upper-sided test
table(vitc$trt, vitc$out) |>
prop.test(alternative = 'greater')
2-sample test for equality of proportions with continuity correction
data: table(vitc$trt, vitc$out)
X-squared = 5.9196, df = 1, p-value = 0.007487
alternative hypothesis: greater
95 percent confidence interval:
0.02303649 1.00000000
sample estimates:
prop 1 prop 2
0.8150852 0.7420147
The interpretation of this test is:
The data provide evidence that vitamin C is effective at preventing common cold (\(\chi^2\) = 5.92 on 1 degree of freedom, p = 0.007487).
And the interval:
With 95% confidence, vitamin C reduces the chance of getting a common cold by at least 2.3 percentage points.
While directional tests are less common, the above provides an example in which a one-sided test more directly answers the research question (since the premise of the study is that vitamin C supplements at worst have no effect on the chances of getting a cold).
Two-sample inference for multinomial data
Now let’s consider comparing health perceptions between NHANES survey years. If \(\mathbf{p}_1\) denotes the set of category proportions in the 2009-2010 survey and \(\mathbf{p}_2\) denotes the same in the 2011-2012 survey, this corresponds to the hypothesis:
\[ \begin{cases} H_0: &\mathbf{p}_1 = \mathbf{p}_2 \\ H_A: &\mathbf{p}_1 \neq \mathbf{p}_2 \end{cases} \] As in the two-sample comparison of binomial proportions, inference proceeds by:
- determining expected counts if \(\mathbf{p}_1 = \mathbf{p}_2 = \mathbf{p}\) from the table margins
- measuring the divergence of observations from expectations using the \(\chi^2\) statistic
- comparing with a \(\chi^2_{K-1}\) model to determine a p-value (where \(K\) is the number of categories)
The R
implementation is provided by chisq.test(...)
given the two-way table as input:
# inference comparing multinomial proportions
table(nhanes$surveyyr, nhanes$healthgen) |>
chisq.test()
Pearson's Chi-squared test
data: table(nhanes$surveyyr, nhanes$healthgen)
X-squared = 1.5223, df = 4, p-value = 0.8227
The data do not provide evidence that health perceptions differ between survey years (\(\chi^2\) = 1.52 on 4 degrees of freedom, p = 0.8227)
Although there is not a significant difference, if there had been we’d look at the residuals of the \(\chi^2\) test to explain the result. This is parallel to the one-sample setting, except that the residuals will be arranged in a two-way table. We need to first store the output of chisq.test(...)
and then retrieve the appropriate element:
# store test result
<- table(nhanes$surveyyr, nhanes$healthgen) |>
chisq.rslt chisq.test()
# inspect test residuals
$residuals chisq.rslt
Excellent Vgood Good Fair Poor
2009_10 0.69889824 -0.37250646 -0.09474994 0.26791188 -0.22312857
2011_12 -0.70201139 0.37416574 0.09517199 -0.26910526 0.22412247
While none of the changes were significant, an increased share of respondents reported being in very good, good, and poor health in the later survey year and fewer respondents reported being in excellent or fair health.
The general rule of thumb for inspecting residuals is to look for values in excess of \(\pm 2\).
Use the nhanes
data to test whether health perceptions differ between men and women. Report the result in context following conventional style and inspect residuals to identify significant differences (if any).
# store test result
<- table(nhanes$gender, nhanes$healthgen) |>
chisq.rslt chisq.test()
# show test result
chisq.rslt
Pearson's Chi-squared test
data: table(nhanes$gender, nhanes$healthgen)
X-squared = 6.767, df = 4, p-value = 0.1487
# inspect test residuals
$residuals chisq.rslt
Excellent Vgood Good Fair Poor
female -1.46898841 0.05252257 0.07816321 0.89445305 0.55802620
male 1.51542384 -0.05418284 -0.08063399 -0.92272715 -0.57566568
The data do not provide evidence that health perceptions differ by sex (\(\chi^2\) = 6.77 on 4 degrees of freedom, p = 0.1487). Although there is not a statistically significant difference, the largest discrepancy is in the share of individuals who perceive themselves in excellent health – the data suggest that men hold this perception more frequently than women.