By the end of the lecture, you will be able to …
Download and open code-along-05.qmd
Install the Rmisc package, available on CRAN.
Run the code chunk in your code-along.
OR: Copy and paste the code into your Console pane. Then hit “Enter”.
Load the standard packages and our new Rmisc package.
Load the 2022 GSS data.
wwwhr: Not counting e-mail, about how many minutes or hours per week do you use the Web?
postlife: Do you believe there is a life after death?
childs: How many children have you ever had?
sex: Respondent’s sex
Create a dataset named my_data with the variables wwwhr, postlife, childs, and sex, with no missing data.
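One way to build my_data is to select the four variables and then drop incomplete rows. The sketch below is a minimal illustration, not the code-along's own chunk: the small gss22 data frame is a made-up stand-in for the real GSS file, and the choice of dplyr::select() plus tidyr::drop_na() is an assumption about which tools the course uses.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for the 2022 GSS data frame (values are invented);
# the real file has many more rows and columns.
gss22 <- data.frame(
  wwwhr    = c(10, NA, 40, 5, 20),
  postlife = c(1, 2, NA, 1, 1),
  childs   = c(2, 0, 3, NA, 1),
  sex      = c(1, 2, 2, 1, 2),
  educ     = c(12, 16, 14, 12, 18)
)

my_data <- gss22 %>%
  select(wwwhr, postlife, childs, sex) %>%  # keep only the four variables
  drop_na()                                 # drop rows with any missing value

nrow(my_data)  # rows 2, 3, and 4 each have an NA, so 2 rows remain
```

Selecting before dropping matters: drop_na() on the full data frame would also discard rows that are missing only on variables you do not need.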
summarise()
# A tibble: 1 × 3
n mean sd
<int> <dbl> <dbl>
1 970 17.7 22.3
summarise() with equation
What can we infer about the population mean based on this confidence interval?
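The equation behind a confidence interval for a mean is the sample mean plus or minus a critical t value times the standard error (sd divided by the square root of n). As a rough sketch, the interval can be rebuilt from the rounded summary statistics reported for wwwhr above; the variable names here are illustrative, and the 95% level is an assumption.

```r
# Rounded summary statistics reported for wwwhr
n      <- 970
mean_x <- 17.7
sd_x   <- 22.3

se     <- sd_x / sqrt(n)          # standard error of the mean
t_crit <- qt(0.975, df = n - 1)   # critical t value for 95% confidence

lower <- mean_x - t_crit * se
upper <- mean_x + t_crit * se
round(c(lower = lower, upper = upper), 1)
```

Because the inputs are rounded, the result should land close to, but not necessarily exactly on, an interval computed from the raw data.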
summarise() + CI()
# A tibble: 1 × 5
n mean_wwwhr sd_wwwhr lowCI_wwwhr hiCI_wwwhr
<int> <dbl> <dbl> <dbl> <dbl>
1 970 17.7 22.3 16.3 19.1
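A table shaped like the one above can be produced by calling CI() inside summarise(). This is a sketch under assumptions: the data are simulated rather than taken from the GSS, and the column names simply mirror the output shown. Rmisc::CI() returns a named vector with elements upper, mean, and lower, so each bound is pulled out by name.

```r
library(dplyr)

set.seed(1)
# Simulated stand-in for my_data$wwwhr (not GSS values)
my_data <- data.frame(wwwhr = rexp(200, rate = 1 / 18))

ci_summary <- my_data %>%
  dplyr::summarise(
    n           = dplyr::n(),
    mean_wwwhr  = mean(wwwhr),
    sd_wwwhr    = sd(wwwhr),
    lowCI_wwwhr = Rmisc::CI(wwwhr, ci = 0.95)[["lower"]],  # lower bound
    hiCI_wwwhr  = Rmisc::CI(wwwhr, ci = 0.95)[["upper"]]   # upper bound
  )
ci_summary
```

Calling Rmisc::CI() and dplyr::summarise() with their namespace prefixes sidesteps name clashes between dplyr and plyr, which Rmisc attaches when loaded with library().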
Neat! But how confident are we?
summarise() + CI()
# A tibble: 1 × 5
n mean_wwwhr sd_wwwhr lowCI_wwwhr hiCI_wwwhr
<int> <dbl> <dbl> <dbl> <dbl>
1 970 17.7 22.3 15.8 19.5
95% confidence intervals
How does a 95% confidence interval differ from a 99% confidence interval in terms of width and certainty?
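To see the trade-off between width and certainty in numbers, the sketch below computes both intervals from the rounded wwwhr summary statistics. The helper ci_mean() is a hypothetical function written for this illustration, not part of Rmisc.

```r
# Hypothetical helper: t-based confidence interval from summary statistics
ci_mean <- function(mean_x, sd_x, n, level) {
  se     <- sd_x / sqrt(n)
  t_crit <- qt(1 - (1 - level) / 2, df = n - 1)
  c(lower = mean_x - t_crit * se, upper = mean_x + t_crit * se)
}

ci95 <- ci_mean(17.7, 22.3, 970, level = 0.95)
ci99 <- ci_mean(17.7, 22.3, 970, level = 0.99)

width95 <- unname(ci95["upper"] - ci95["lower"])
width99 <- unname(ci99["upper"] - ci99["lower"])

round(rbind(ci95, ci99), 1)
width99 > width95  # more confidence requires a wider interval
```

The only thing that changes between the two calls is the critical t value, so the 99% interval is wider by the ratio of the two critical values.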
# A tibble: 1 × 5
n mean_ghosts sd_ghosts lowCI_ghosts hiCI_ghosts
<int> <dbl> <dbl> <dbl> <dbl>
1 970 1.19 0.389 1.15 1.22
This doesn’t work: the mean equals a proportion only when the values are coded 0 and 1.
# A tibble: 1 × 5
n mean_ghosts sd_ghosts lowCI_ghosts hiCI_ghosts
<int> <dbl> <dbl> <dbl> <dbl>
1 970 0.814 0.389 0.782 0.847
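The recode that makes the mean interpretable as a proportion can be sketched as follows. The raw coding of 1 = yes and 2 = no is an assumption made for this example (check the codebook for the actual values), and the ten responses are invented.

```r
# Hypothetical raw responses, assumed coding: 1 = yes, 2 = no
ghosts_raw <- c(1, 1, 2, 1, 2, 1, 1, 1, 2, 1)

mean(ghosts_raw)  # 1.3: a mean of codes, not a proportion

# Recode so that 1 = yes and 0 = no
ghosts <- ifelse(ghosts_raw == 1, 1, 0)

mean(ghosts)      # 0.7: the share of "yes" responses
sd(ghosts) == sd(ghosts_raw)  # the spread is unchanged by this recode
```

This matches the pattern in the two tables above: the mean changes after recoding while the standard deviation stays the same.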


t.test()
Performs one and two sample t-tests. To use this function, we’ll need to specify a few things:
t.test() in action
In your family sociology course you learned that fertility rates have been dropping.
You used to think that people had 2 children, on average. But now you suspect that people are having fewer than 2 children.
How do you test your hypothesis?
Which of these equations represents your research hypothesis?
Which of these equations represents your null hypothesis?
# A tibble: 1 × 4
n mean median sd
<int> <dbl> <dbl> <dbl>
1 970 1.79 2 1.64
Is the difference between our sample mean and 2 statistically significant or could we have gotten this mean simply by chance (sample variation)?
# A tibble: 1 × 7
n mean median sd se lower upper
<int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 970 1.79 2 1.64 0.0526 1.66 1.93
Does the confidence interval include 2? What does that tell you?
t.test() for means
One Sample t-test
data: my_data$childs
t = 34.102, df = 969, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
1.690587 1.897041
sample estimates:
mean of x
1.793814
H0: μ = 0 by default
t.test() for means
One Sample t-test
data: my_data$childs
t = -3.9197, df = 969, p-value = 4.744e-05
alternative hypothesis: true mean is less than 2
95 percent confidence interval:
-Inf 1.88042
sample estimates:
mean of x
1.793814
H0: μ = 2; one-tailed test
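Output like the one-tailed test above comes from setting two arguments of t.test(): mu, the value of the mean under the null hypothesis, and alternative, the direction of the research hypothesis. The sketch below uses simulated counts rather than the GSS childs variable, so its numbers will not match the slide.

```r
set.seed(2)
# Simulated stand-in for my_data$childs (not GSS values)
childs <- rpois(500, lambda = 1.8)

# H0: mu = 2; research hypothesis: mu < 2 (one-tailed)
res <- t.test(childs, mu = 2, alternative = "less")
res$statistic
res$p.value

# The same t statistic by hand: (sample mean - null value) / standard error
t_by_hand <- (mean(childs) - 2) / (sd(childs) / sqrt(length(childs)))
t_by_hand
```

Leaving out mu and alternative gives the defaults shown in the first output: a null value of 0 and a two-sided test.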
Assuming an α level of .05, should you reject the null hypothesis?
Which of these statements best summarizes your conclusion?
t.test() in action
Are the proportions of men and women who believe in ghosts statistically significantly different?
Welch Two Sample t-test
data: my_data$ghosts by my_data$sex
t = -3.2072, df = 923.29, p-value = 0.001387
alternative hypothesis: true difference in means between group male and group female is not equal to 0
95 percent confidence interval:
-0.1291289 -0.0310882
sample estimates:
mean in group male mean in group female
0.7733051 0.8534137
Assuming an α level of .05, what do you conclude?
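A two-group comparison like this one can be run with the formula interface, outcome ~ group. The sketch below uses simulated 0/1 responses, so the group proportions and sample sizes are assumptions and the results will differ from the GSS output above.

```r
set.seed(3)
# Simulated stand-in data: ghosts coded 0/1, sex as a two-level factor
my_data <- data.frame(
  sex    = factor(rep(c("male", "female"), each = 300),
                  levels = c("male", "female")),
  ghosts = c(rbinom(300, 1, 0.77), rbinom(300, 1, 0.85))
)

# Welch two-sample t-test (t.test() does not assume equal variances by default)
res <- t.test(ghosts ~ sex, data = my_data)
res$estimate  # group means, which are proportions for 0/1 variables
res$conf.int  # CI for the difference: first level minus second level
res$p.value

reject_null <- res$p.value < 0.05  # decision rule at alpha = .05
reject_null
```

The difference is always computed as the first factor level minus the second, which is why the interval in the output above is negative when the second group has the higher mean.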
t.test()
Independent group t-test
Do two subsamples have the same mean?
Do women have more years of education (educ) than men, on average?
Are the proportions of men & women who support abortion for any reason (abany) statistically different?
Do fewer men than women support gun laws (gunlaw)?
Welch Two Sample t-test
data: gss22$educ by gss22$sex
t = -0.38076, df = 4035.7, p-value = 0.3517
alternative hypothesis: true difference in means between group 1 and group 2 is less than 0
95 percent confidence interval:
-Inf 0.1144853
sample estimates:
mean in group 1 mean in group 2
14.14338 14.17786
Welch Two Sample t-test
data: gss22$abany by gss22$sex
t = -1.0701, df = 1317.7, p-value = 0.2848
alternative hypothesis: true difference in means between group 1 and group 2 is not equal to 0
95 percent confidence interval:
-0.08169512 0.02402811
sample estimates:
mean in group 1 mean in group 2
0.5786164 0.6074499
Welch Two Sample t-test
data: gss22$gunlaw by gss22$sex
t = -8.9698, df = 2476.6, p-value < 2.2e-16
alternative hypothesis: true difference in means between group 1 and group 2 is not equal to 0
95 percent confidence interval:
-0.1872801 -0.1200856
sample estimates:
mean in group 1 mean in group 2
0.6450593 0.7987421
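For a directional question such as the education one, the alternative argument has to match the order of the groups. The sketch below assumes sex is coded 1 = male and 2 = female and uses simulated education values in a hypothetical data frame named gss_sim, so it illustrates the setup rather than reproducing the output above.

```r
set.seed(4)
# Simulated stand-in data; assumed coding: 1 = male, 2 = female
gss_sim <- data.frame(
  sex  = rep(c(1, 2), each = 400),
  educ = c(rnorm(400, mean = 14.1, sd = 3), rnorm(400, mean = 14.2, sd = 3))
)

# "Do women have more education than men?" means mean(group 1) - mean(group 2) < 0,
# so the one-tailed alternative is "less"
res <- t.test(educ ~ sex, data = gss_sim, alternative = "less")
res$estimate
res$conf.int  # one-sided interval: the lower bound is -Inf
res$p.value
```

If the groups were ordered the other way around, the same research hypothesis would call for alternative = "greater".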