library(tidyverse)
library(DescTools)
load('data/nhanes500.RData')
Lab: One-sample inference for proportions
With solutions
The goal of this lab is to learn how to implement one-sample inference for population proportions from binomial and multinomial data. You’ll reproduce examples from lecture using the NHANES data.
Point estimation
Sample proportions are easily calculated by tabulating the number of occurrences of each category and then normalizing the counts by the sample size. Earlier in the quarter, you learned how to do this using table(...) and prop.table(...):
# tabulate counts
table(nhanes$diabetes)
Yes No
57 443
# compute sample proportions
table(nhanes$diabetes) |> prop.table()
Yes No
0.114 0.886
The resulting sample proportions provide point estimates for the population proportions.
Diabetes prevalence among U.S. adults is estimated to be 11.4%.
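Since the tidyverse is loaded above, the same estimates can also be obtained in a single pipeline with count(...) and mutate(...). This is just an optional alternative to table(...); note that count(...) keeps missing values as their own row, so filter them out first if you want proportions among nonmissing responses only.
# tidyverse alternative: tabulate counts and normalize in one pipeline
nhanes |>
  count(diabetes) |>
  mutate(prop = n / sum(n))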
The sleeptrouble variable in the NHANES dataset records whether the participant experiences sleep trouble. Estimate the proportion of U.S. adults that experience sleep trouble.
# compute sample proportions
table(nhanes$sleeptrouble) |> prop.table()
Yes No
0.278 0.722
An estimated 27.8% of U.S. adults experience sleep trouble.
For a variable with more than two categories, the process of estimation is identical:
# tabulate counts
table(nhanes$healthgen)
Excellent Vgood Good Fair Poor
47 162 177 53 11
# compute sample proportions
table(nhanes$healthgen) |> prop.table()
Excellent Vgood Good Fair Poor
0.10444444 0.36000000 0.39333333 0.11777778 0.02444444
Example interpretation:
An estimated 10.44% of U.S. adults report being in excellent health.
The homeown variable in the NHANES dataset records whether the participant owns a home, rents a home, or has another living arrangement. Estimate the proportions of U.S. adults that own, rent, or do neither. (Keep in mind this data was from 2009-2012.)
# compute sample proportions
table(nhanes$homeown) |> prop.table()
Own Rent Other
0.69979716 0.27991886 0.02028398
An estimated 69.98% of U.S. adults own a home; an estimated 27.99% of U.S. adults rent a home; and an estimated 2.03% of U.S. adults neither own nor rent.
By default, the tabulation removes missing observations. To determine how many observations are missing, one can check the sum of the counts against the sample size.
# number of tabulated (nonmissing) values
table(nhanes$healthgen) |> sum()
[1] 450
# total number of values
length(nhanes$healthgen)
[1] 500
For the general health variable, 450 observations were nonmissing, but there were 500 observations; so 50 values were missing.
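As a quick cross-check, the number of missing values can also be counted directly, or kept in the tabulation. A minimal sketch using base R:
# count missing values directly
sum(is.na(nhanes$healthgen))
# keep missing values as their own category in the tabulation
table(nhanes$healthgen, useNA = 'ifany')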
How many responses are missing for the home ownership variable homeown?
# number of tabulated values
table(nhanes$homeown) |> sum()
[1] 493
The sample size is the same as above (\(n = 500\)) and there are 493 nonmissing values, so 7 responses are missing.
We distinguish “binomial” data having only two categories from “multinomial” data having more than two categories. Above, the diabetes variable provides an example of binomial data and the general health variable provides an example of multinomial data.
Inference for binomial data
If we wish to test whether diabetes prevalence is 10%, the relevant hypothesis is:
\[ \begin{cases} H_0: p = 0.1 \\ H_A: p \neq 0.1 \end{cases} \]
As discussed in lecture, the conceptual basis for constructing a test is to consider that if \(p = 0.1\), then one would expect the observed counts to reflect a 90-10 split; this yields expected counts of 50 diabetics and 450 nondiabetics in a sample of 500. We compare these to the observed counts using the \(\chi^2\) statistic:
\[ \chi^2 = \sum_{i = 1}^2 \frac{(\text{observed}_i - \text{expected}_i)^2}{\text{expected}_i} \]
If \(\chi^2\) is sufficiently large, then the observed and expected counts differ and the data provide evidence favoring the alternative that \(p \neq 0.1\). In R, the prop.test(...) function will perform this calculation and return a p-value for the test. There are two ways to specify the inputs.
- [option 1] specify the count x for the outcome of interest, the sample size n, and the hypothesis for p
- [option 2] pass a table to the function and specify the hypothesis for p
These are shown below. Note that the outputs are identical.
# test of binomial proportion (option 1)
prop.test(x = 57, n = 500, p = 0.1)
1-sample proportions test with continuity correction
data: 57 out of 500, null probability 0.1
X-squared = 0.93889, df = 1, p-value = 0.3326
alternative hypothesis: true p is not equal to 0.1
95 percent confidence interval:
0.08814952 0.14594579
sample estimates:
p
0.114
# test of binomial proportion (option 2)
table(nhanes$diabetes) |> prop.test(p = 0.1)
1-sample proportions test with continuity correction
data: table(nhanes$diabetes), null probability 0.1
X-squared = 0.93889, df = 1, p-value = 0.3326
alternative hypothesis: true p is not equal to 0.1
95 percent confidence interval:
0.08814952 0.14594579
sample estimates:
p
0.114
Interpretation of the \(\chi^2\) test in conventional style:
The data do not provide evidence that diabetes prevalence among U.S. adults differs from 10% (\(\chi^2\) = 0.94 on 1 degree of freedom, p = 0.3326).
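If you want to see where the reported statistic comes from, it can be reproduced by hand from the observed and expected counts. A minimal sketch (the 0.5 term is the continuity correction noted in the output above; the names observed and expected are arbitrary):
# observed counts and expected counts under H0: p = 0.1
observed <- c(57, 443)
expected <- 500 * c(0.1, 0.9)
# chi-square statistic with the continuity correction (should match 0.93889 above)
sum((abs(observed - expected) - 0.5)^2 / expected)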
The function also returns a confidence interval for the binomial proportion \(p\):
With 95% confidence, diabetes prevalence among U.S. adults is estimated to be between 8.81% and 14.59%.
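If you only need the interval (for example, to report the endpoints programmatically), you can store the test output and extract its conf.int element, following the same store-and-extract pattern used for residuals later in this lab; rslt is an arbitrary name:
# store test output
rslt <- prop.test(x = 57, n = 500, p = 0.1)
# extract the confidence interval
rslt$conf.int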
Test whether one in five U.S. adults experience sleep trouble. Report results in context following conventional style and provide a 95% confidence interval for the proportion.
# test of binomial proportion
table(nhanes$sleeptrouble) |> prop.test(p = 0.2)
1-sample proportions test with continuity correction
data: table(nhanes$sleeptrouble), null probability 0.2
X-squared = 18.528, df = 1, p-value = 1.674e-05
alternative hypothesis: true p is not equal to 0.2
95 percent confidence interval:
0.2395872 0.3198838
sample estimates:
p
0.278
The data provide evidence that the proportion of U.S. adults who experience sleep trouble differs from 0.2 (\(\chi^2\) = 18.53 on 1 degree of freedom, p = 0.00001674). With 95% confidence, between 23.96% and 31.99% of U.S. adults experience sleep trouble.
It is possible to change both the confidence level and the direction of the alternative using prop.test(...). For example:
# adjust confidence level
prop.test(x = 57, n = 500, p = 0.1, conf.level = 0.99)
1-sample proportions test with continuity correction
data: 57 out of 500, null probability 0.1
X-squared = 0.93889, df = 1, p-value = 0.3326
alternative hypothesis: true p is not equal to 0.1
99 percent confidence interval:
0.0814852 0.1568967
sample estimates:
p
0.114
# upper-sided test
prop.test(x = 57, n = 500, p = 0.1, alternative = 'greater')
1-sample proportions test with continuity correction
data: 57 out of 500, null probability 0.1
X-squared = 0.93889, df = 1, p-value = 0.1663
alternative hypothesis: true p is greater than 0.1
95 percent confidence interval:
0.09176378 1.00000000
sample estimates:
p
0.114
We didn’t discuss directional tests in detail in lecture. Here’s how you’d interpret the test above:
The data do not provide evidence that diabetes prevalence among U.S. adults exceeds 10% (\(\chi^2\) = 0.94 on 1 degree of freedom, p = 0.1663).
And the corresponding one-sided interval:
With 95% confidence, diabetes prevalence among U.S. adults is estimated to be at least 9.18%.
Inference for multinomial data
If we wish to test whether health perceptions differ from a baseline characterized by the set of proportions \[\mathbf{p}_0 = (p_{01}, p_{02}, \dots, p_{05}) = (0.13, 0.32, 0.35, 0.13, 0.06)\] the relevant hypothesis is:
\[ \begin{cases} H_0: &\mathbf{p} = \mathbf{p}_0 \\ H_A: &\mathbf{p} \neq \mathbf{p}_0 \end{cases} \]
Just as for binomial data, we determine expected counts when \(\mathbf{p} = \mathbf{p}_0\) and then measure the divergence of observations from expectations under \(H_0\) using a \(\chi^2\) statistic, which is then compared with a \(\chi^2\) model to compute a p-value for the test. In R, the chisq.test(...) function provides an implementation. As with prop.test(...), the counts can be specified directly or supplied as a table:
# chi square test (option 1)
chisq.test(x = c(47, 162, 177, 53, 11),
p = c(0.13, 0.32, 0.35, 0.13, 0.06),
rescale.p = T)
Chi-squared test for given probabilities
data: c(47, 162, 177, 53, 11)
X-squared = 16.709, df = 4, p-value = 0.002201
# chi square test (option 2)
table(nhanes$healthgen) |>
chisq.test(p = c(0.13, 0.32, 0.35, 0.13, 0.06),
rescale.p = T)
Chi-squared test for given probabilities
data: table(nhanes$healthgen)
X-squared = 16.709, df = 4, p-value = 0.002201
The data provide evidence that health perceptions differ from the hypothesized baseline (\(\chi^2\) = 16.71 on 4 degrees of freedom, p = 0.0022).
The rescale.p = T argument normalizes the hypothesized proportions p so that they sum to 1. Be careful using this – you want to ensure that it is only adjusting for rounding errors.
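One way to check that is to define the hypothesized vector once and look at its sum before passing it to the test. A minimal sketch (p0 is an arbitrary name; here the values sum to 0.99, a rounding shortfall that rescale.p = T adjusts for):
# hypothesized proportions
p0 <- c(0.13, 0.32, 0.35, 0.13, 0.06)
# check how far the sum is from 1 before relying on rescale.p
sum(p0)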
Suppose that someone asserts 70% of U.S. adults own a home and a quarter rent. Test whether the NHANES data support or contradict this hypothesis.
# chi square test
table(nhanes$homeown) |>
chisq.test(p = c(0.7, 0.25, 0.05))
Chi-squared test for given probabilities
data: table(nhanes$homeown)
X-squared = 10.472, df = 2, p-value = 0.005321
The data provide evidence that the shares of homeowners and renters differ from the asserted figures (\(\chi^2\) = 10.47 on 2 degrees of freedom, p = 0.0053).
When a significant difference is identified, it is common to inspect residuals – normalized signed differences between observed and expected counts – to explain which categories appear to differ. The chisq.test(...) function computes residuals but does not print them by default. To retrieve and inspect the residuals, store the output of the test and then find the corresponding element of the stored output:
# store test output
chisq.rslt <- table(nhanes$healthgen) |>
  chisq.test(p = c(0.13, 0.32, 0.35, 0.13, 0.06),
             rescale.p = T)
# inspect residuals
chisq.rslt$residuals
Excellent Vgood Good Fair Poor
-1.5728910 1.3718766 1.4198774 -0.7923586 -3.1159900
The difference from the baseline is driven by far fewer respondents than expected reporting poor health.
As a rule of thumb, you want to look for residuals larger than \(\pm 2\) in magnitude. An interesting point above is that the Poor category differs significantly from the hypothesized value but none of the others do. Yet, due to compositionality, we know that this difference must be offset by higher-than-expected values elsewhere – it just so happens that they’re distributed in such a way that the other categories remain consistent with the hypothesized values.
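To apply that rule of thumb programmatically, you can flag the residuals that exceed 2 in magnitude, reusing the stored result from above; a minimal sketch:
# flag categories whose residuals exceed 2 in magnitude
abs(chisq.rslt$residuals) > 2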
Inspect the residuals from the \(\chi^2\) test you performed for the previous exercise. Explain which (if any) proportions differ from the asserted figures.
# store test output
chisq.rslt <- table(nhanes$homeown) |>
  chisq.test(p = c(0.7, 0.25, 0.05))
# inspect residuals
chisq.rslt$residuals
Own Rent Other
-0.005383039 1.328613180 -2.950727899
Significantly fewer respondents than expected report neither owning nor renting.
Lastly, we can construct pointwise confidence intervals for the proportions using the MultinomCI function from the DescTools package:
# confidence intervals for multinomial proportions
table(nhanes$healthgen) |>
MultinomCI(method = 'wald', conf.level = 0.95)
est lwr.ci upr.ci
Excellent 0.10444444 0.07618714 0.13270175
Vgood 0.36000000 0.31565108 0.40434892
Good 0.39333333 0.34820001 0.43846665
Fair 0.11777778 0.08799518 0.14756037
Poor 0.02444444 0.01017661 0.03871227
Recall from lecture that these estimates are simply separate intervals for binomial proportions (that’s what method = 'wald' does); they are not adjusted for multiplicity. To implement a Bonferroni correction, we should manually change the confidence level by dividing the noncoverage rate by the number of categories:
# bonferroni correction
table(nhanes$healthgen) |>
MultinomCI(method = 'wald', conf.level = 1 - 0.05/5)
est lwr.ci upr.ci
Excellent 0.10444444 0.067308048 0.14158084
Vgood 0.36000000 0.301715636 0.41828436
Good 0.39333333 0.334018097 0.45264857
Fair 0.11777778 0.078636816 0.15691874
Poor 0.02444444 0.005693337 0.04319555
Example interpretation:
With 95% confidence, the share of U.S. adults who perceive themselves to be in excellent health is estimated to be between 6.73% and 14.16%.
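If the number of categories were to change, the adjusted level can be computed from the table itself rather than hard-coding the divisor of 5. A minimal sketch (k and conf_adj are arbitrary names):
# bonferroni-adjusted confidence level computed from the number of categories
k <- length(table(nhanes$healthgen))
conf_adj <- 1 - 0.05 / k
# simultaneous intervals at the adjusted level
table(nhanes$healthgen) |>
  MultinomCI(method = 'wald', conf.level = conf_adj)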
Construct simultaneous 95% confidence intervals for the share of U.S. adults that rent, own, and do neither. Interpret the interval for the share of homeowners in context following conventional style.
# confidence intervals with bonferroni correction
table(nhanes$homeown) |>
MultinomCI(method = 'wald', conf.level = 1 - 0.05/3)
est lwr.ci upr.ci
Own 0.69979716 0.650378552 0.74921577
Rent 0.27991886 0.231512348 0.32832538
Other 0.02028398 0.005084673 0.03548328
With 95% confidence, the share of U.S. adults who own a home is estimated to be between 65.04% and 74.92%.