Lab: One-sample inference for proportions

With solutions

The goal of this lab is to learn how to implement one-sample inference for population proportions from binomial and multinomial data. You’ll reproduce examples from lecture using the NHANES data.

library(tidyverse)
library(DescTools)
load('data/nhanes500.RData')

Point estimation

Sample proportions are easily calculated by tabulating the number of occurrences of each category and then normalizing the counts by the sample size. Earlier in the quarter, you learned how to do this using table(...) and prop.table(...):

# tabulate counts
table(nhanes$diabetes)

Yes  No 
 57 443 
# compute sample proportions
table(nhanes$diabetes) |> prop.table()

  Yes    No 
0.114 0.886 

The resulting sample proportions provide point estimates for the population proportions.

Diabetes prevalence among U.S. adults is estimated to be 11.4%.

Your turn

The sleeptrouble variable in the NHANES dataset records whether the participant experiences sleep trouble. Estimate the proportion of U.S. adults that experience sleep trouble.

# compute sample proportions
table(nhanes$sleeptrouble) |> prop.table()

  Yes    No 
0.278 0.722 

An estimated 27.8% of U.S. adults experience sleep trouble.

For a variable with more than two categories, the process of estimation is identical:

# tabulate counts
table(nhanes$healthgen)

Excellent     Vgood      Good      Fair      Poor 
       47       162       177        53        11 
# compute sample proportions
table(nhanes$healthgen) |> prop.table()

 Excellent      Vgood       Good       Fair       Poor 
0.10444444 0.36000000 0.39333333 0.11777778 0.02444444 

Example interpretation:

An estimated 10.44% of U.S. adults report being in excellent health.

Your turn

The homeown variable in the NHANES dataset records whether the participant owns a home, rents a home, or has another living arrangement. Estimate the proportions of U.S. adults that own, rent, or do neither. (Keep in mind that these data are from 2009-2012.)

# compute sample proportions
table(nhanes$homeown) |> prop.table()

       Own       Rent      Other 
0.69979716 0.27991886 0.02028398 

An estimated 69.98% of U.S. adults own a home; an estimated 27.99% of U.S. adults rent a home; and an estimated 2.03% of U.S. adults neither own nor rent.

By default, the tabulation removes missing observations. To determine how many observations are missing, one can check the sum of the counts against the sample size.

# number of tabulated (nonmissing) values
table(nhanes$healthgen) |> sum()
[1] 450
# total number of values
length(nhanes$healthgen)
[1] 500

For the general health variable, 450 observations were nonmissing, but there were 500 observations; so 50 values were missing.
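If missing responses are stored as NA (which is what table(...) silently drops), they can also be counted directly. Here is a minimal check that should agree with the difference computed above:

# count missing values directly
sum(is.na(nhanes$healthgen))   # should equal 500 - 450 = 50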

Your turn

How many responses are missing for the home ownership variable homeown?

# number of tabulated values
table(nhanes$homeown) |> sum()
[1] 493

The sample size is the same as above (\(n = 500\)) and there are 493 nonmissing values, so 7 responses are missing.

We distinguish “binomial” data having only two categories from “multinomial” data having more than two categories. Above, the diabetes variable provides an example of binomial data and the general health variable provides an example of multinomial data.

Inference for binomial data

If we wish to test whether diabetes prevalence is 10%, the relevant hypothesis is:

\[ \begin{cases} H_0: p = 0.1 \\ H_A: p \neq 0.1 \end{cases} \]

As discussed in lecture, the conceptual basis for constructing a test is to consider that if \(p = 0.1\), then one would expect the observed counts to reflect a 90-10 split; this yields expected counts of 50 diabetics and 450 nondiabetics in a sample of 500. We compare these to the observed counts using the \(\chi^2\) statistic:

\[ \chi^2 = \sum_{i = 1}^2 \frac{(\text{observed}_i - \text{expected}_i)^2}{\text{expected}_i} \]

If \(\chi^2\) is sufficiently large, then the observed counts diverge substantially from the expected counts, and the data provide evidence favoring the alternative that \(p \neq 0.1\). In R, the prop.test(...) function will perform this calculation and return a p-value for the test. There are two ways to specify the inputs.

  • [option 1] specify the count x for the outcome of interest, the sample size n, and the hypothesis for p
  • [option 2] pass a table to the function and specify the hypothesis for p

These are shown below. Note that the outputs are identical.

# test of binomial proportion (option 1)
prop.test(x = 57, n = 500, p = 0.1)

    1-sample proportions test with continuity correction

data:  57 out of 500, null probability 0.1
X-squared = 0.93889, df = 1, p-value = 0.3326
alternative hypothesis: true p is not equal to 0.1
95 percent confidence interval:
 0.08814952 0.14594579
sample estimates:
    p 
0.114 
# test of binomial proportion (option 2)
table(nhanes$diabetes) |> prop.test(p = 0.1)

    1-sample proportions test with continuity correction

data:  table(nhanes$diabetes), null probability 0.1
X-squared = 0.93889, df = 1, p-value = 0.3326
alternative hypothesis: true p is not equal to 0.1
95 percent confidence interval:
 0.08814952 0.14594579
sample estimates:
    p 
0.114 

Interpretation of the \(\chi^2\) test in conventional style:

The data do not provide evidence that diabetes prevalence among U.S. adults differs from 10% (\(\chi^2\) = 0.94 on 1 degree of freedom, p = 0.3326).

The function also returns a confidence interval for the binomial proportion \(p\):

With 95% confidence, diabetes prevalence among U.S. adults is estimated to be between 8.81% and 14.59%.
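As a sanity check on the mechanics, the reported statistic and p-value can be reproduced by hand from the observed and expected counts. Note that prop.test(...) applies the Yates continuity correction by default, so the sketch below includes it:

# reproduce the test statistic by hand (with the continuity correction)
observed <- c(57, 443)                          # observed counts (diabetic, nondiabetic)
expected <- 500 * c(0.1, 0.9)                   # expected counts under H0: p = 0.1
chisq_stat <- sum((abs(observed - expected) - 0.5)^2 / expected)
chisq_stat                                      # should match X-squared = 0.93889
pchisq(chisq_stat, df = 1, lower.tail = FALSE)  # should match p-value = 0.3326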

Your turn

Test whether one in five U.S. adults experience sleep trouble. Report results in context following conventional style and provide a 95% confidence interval for the proportion.

# test of binomial proportion
table(nhanes$sleeptrouble) |> prop.test(p = 0.2)

    1-sample proportions test with continuity correction

data:  table(nhanes$sleeptrouble), null probability 0.2
X-squared = 18.528, df = 1, p-value = 1.674e-05
alternative hypothesis: true p is not equal to 0.2
95 percent confidence interval:
 0.2395872 0.3198838
sample estimates:
    p 
0.278 

The data provide evidence that the proportion of U.S. adults who experience sleep trouble differs from 0.2 (\(\chi^2\) = 18.53 on 1 degree of freedom, p = 0.00001674). With 95% confidence, between 23.96% and 31.99% of U.S. adults experience sleep trouble.

It is possible to change both the confidence level and the direction of the alternative using prop.test(...). For example:

# adjust confidence level
prop.test(x = 57, n = 500, p = 0.1, conf.level = 0.99)

    1-sample proportions test with continuity correction

data:  57 out of 500, null probability 0.1
X-squared = 0.93889, df = 1, p-value = 0.3326
alternative hypothesis: true p is not equal to 0.1
99 percent confidence interval:
 0.0814852 0.1568967
sample estimates:
    p 
0.114 
# upper-sided test
prop.test(x = 57, n = 500, p = 0.1, alternative = 'greater')

    1-sample proportions test with continuity correction

data:  57 out of 500, null probability 0.1
X-squared = 0.93889, df = 1, p-value = 0.1663
alternative hypothesis: true p is greater than 0.1
95 percent confidence interval:
 0.09176378 1.00000000
sample estimates:
    p 
0.114 

We didn’t discuss directional tests in detail in lecture. Here’s how you’d interpret the test above:

The data do not provide evidence that diabetes prevalence among U.S. adults exceeds 10% (\(\chi^2\) = 0.94 on 1 degree of freedom, p = 0.1663).

And the corresponding one-sided interval:

With 95% confidence, diabetes prevalence among U.S. adults is estimated to be at least 9.18%.

Inference for multinomial data

If we wish to test whether health perceptions differ from a baseline characterized by the set of proportions \[\mathbf{p}_0 = (p_{01}, p_{02}, \dots, p_{05}) = (0.13, 0.32, 0.35, 0.13, 0.06)\] the relevant hypothesis is:

\[ \begin{cases} H_0: &\mathbf{p} = \mathbf{p}_0 \\ H_A: &\mathbf{p} \neq \mathbf{p}_0 \end{cases} \]

Just as for binomial data, we determine expected counts when \(\mathbf{p} = \mathbf{p}_0\) and then measure the divergence of observations from expectations under \(H_0\) using a \(\chi^2\) statistic, which is then compared with a \(\chi^2\) model to compute a p-value for the test. In R, the chisq.test(...) function provides an implementation. As with prop.test(...), the counts can be specified directly or supplied as a table:

# chi square test (option 1)
chisq.test(x = c(47, 162, 177, 53, 11),
           p = c(0.13, 0.32, 0.35, 0.13, 0.06),
           rescale.p = TRUE)

    Chi-squared test for given probabilities

data:  c(47, 162, 177, 53, 11)
X-squared = 16.709, df = 4, p-value = 0.002201
# chi square test (option 2)
table(nhanes$healthgen) |>
  chisq.test(p = c(0.13, 0.32, 0.35, 0.13, 0.06),
             rescale.p = TRUE)

    Chi-squared test for given probabilities

data:  table(nhanes$healthgen)
X-squared = 16.709, df = 4, p-value = 0.002201

The data provide evidence that health perceptions differ from the hypothesized baseline (\(\chi^2\) = 16.71 on 4 degrees of freedom, p = 0.0022).

Setting rescale.p = TRUE normalizes the hypothesized proportions p so that they sum to 1. Be careful using this – make sure it is only adjusting for rounding error.
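To see exactly what the rescaling does, here is a sketch that reproduces the expected counts and the statistic by hand; the hypothesized proportions sum to 0.99, so rescale.p = TRUE divides by that sum before computing expected counts:

# reproduce the multinomial test statistic by hand
observed <- table(nhanes$healthgen)
p0 <- c(0.13, 0.32, 0.35, 0.13, 0.06)
p0 <- p0 / sum(p0)                        # rescale so the hypothesized proportions sum to 1
expected <- sum(observed) * p0            # expected counts under H0
sum((observed - expected)^2 / expected)   # should match X-squared = 16.709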

Your turn

Suppose that someone asserts 70% of U.S. adults own a home and a quarter rent. Test whether the NHANES data support or contradict this hypothesis.

# chi square test
table(nhanes$homeown) |>
  chisq.test(p = c(0.7, 0.25, 0.05))

    Chi-squared test for given probabilities

data:  table(nhanes$homeown)
X-squared = 10.472, df = 2, p-value = 0.005321

The data provide evidence that the shares of homeowners and renters differ from the asserted figures (\(\chi^2\) = 10.47 on 2 degrees of freedom, p = 0.0053).

When a significant difference is identified, it is common to inspect residuals – normalized signed differences between observed and expected counts – to determine which categories appear to differ. The chisq.test(...) function computes residuals but does not print them by default. To inspect them, store the output of the test and then extract the residuals element from the stored output:

# store test output
chisq.rslt <- table(nhanes$healthgen) |> 
  chisq.test(p = c(0.13, 0.32, 0.35, 0.13, 0.06),
             rescale.p = TRUE)

# inspect residuals
chisq.rslt$residuals

 Excellent      Vgood       Good       Fair       Poor 
-1.5728910  1.3718766  1.4198774 -0.7923586 -3.1159900 

The difference from the baseline is driven by far fewer respondents than expected reporting being in poor health.

As a rule of thumb, look for residuals larger than 2 in magnitude. An interesting point above is that the Poor category differs markedly from its hypothesized value while none of the others do. Yet, because the proportions must sum to 1, this shortfall has to be offset by higher-than-expected counts elsewhere – it just so happens that the excess is spread across the other categories in such a way that each remains individually consistent with its hypothesized value.
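For reference, the stored residuals are Pearson residuals: signed differences between observed and expected counts scaled by the square root of the expected counts. They can be reproduced from the counts saved in the test object:

# reproduce the residuals from the stored observed and expected counts
(chisq.rslt$observed - chisq.rslt$expected) / sqrt(chisq.rslt$expected)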

Your turn

Inspect the residuals from the \(\chi^2\) test you performed for the previous exercise. Explain which (if any) proportions differ from the asserted figures.

# store test output
chisq.rslt <- table(nhanes$homeown) |>
  chisq.test(p = c(0.7, 0.25, 0.05))

# inspect residuals
chisq.rslt$residuals

         Own         Rent        Other 
-0.005383039  1.328613180 -2.950727899 

Significantly fewer respondents than expected report neither owning nor renting.

Lastly, we can construct pointwise confidence intervals for the proportions using the MultinomCI function from the DescTools package:

# confidence intervals for multinomial proportions
table(nhanes$healthgen) |>
  MultinomCI(method = 'wald', conf.level = 0.95)
                 est     lwr.ci     upr.ci
Excellent 0.10444444 0.07618714 0.13270175
Vgood     0.36000000 0.31565108 0.40434892
Good      0.39333333 0.34820001 0.43846665
Fair      0.11777778 0.08799518 0.14756037
Poor      0.02444444 0.01017661 0.03871227

Recall from lecture that these are simply separate Wald intervals for the individual binomial proportions (that's what method = 'wald' specifies); they are not adjusted for multiplicity. To implement a Bonferroni correction, we manually change the confidence level by dividing the noncoverage rate by the number of categories:

# bonferroni correction
table(nhanes$healthgen) |>
  MultinomCI(method = 'wald', conf.level = 1 - 0.05/5)
                 est      lwr.ci     upr.ci
Excellent 0.10444444 0.067308048 0.14158084
Vgood     0.36000000 0.301715636 0.41828436
Good      0.39333333 0.334018097 0.45264857
Fair      0.11777778 0.078636816 0.15691874
Poor      0.02444444 0.005693337 0.04319555

Example interpretation:

With 95% confidence, the share of U.S. adults who perceive themselves to be in excellent health is estimated to be between 6.73% and 14.16%.
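As a check that these really are per-category Wald intervals at an adjusted level, here is a minimal sketch reproducing the Bonferroni-adjusted interval for the Excellent category by hand (the point estimate plus or minus a normal critical value times the estimated standard error):

# hand-check the wald interval for 'excellent' at the bonferroni-adjusted level
n_obs <- sum(table(nhanes$healthgen))       # 450 nonmissing responses
p_hat <- 47 / n_obs                         # sample proportion for excellent health
z <- qnorm(1 - (0.05 / 5) / 2)              # critical value at the adjusted level
p_hat + c(-1, 1) * z * sqrt(p_hat * (1 - p_hat) / n_obs)   # should match the row above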

Your turn

Construct simultaneous 95% confidence intervals for the share of U.S. adults that rent, own, and do neither. Interpret the interval for the share of homeowners in context following conventional style.

# confidence intervals with bonferroni correction
table(nhanes$homeown) |>
  MultinomCI(method = 'wald', conf.level = 1 - 0.05/3)
             est      lwr.ci     upr.ci
Own   0.69979716 0.650378552 0.74921577
Rent  0.27991886 0.231512348 0.32832538
Other 0.02028398 0.005084673 0.03548328

With 95% confidence, the share of U.S. adults who own a home is estimated to be between 65.04% and 74.92%.