Exam 2 review

Lecture 23

Dr. Mine Çetinkaya-Rundel

Duke University
STA 199 - Fall 2022

11/17/22

Warm up

Announcements

Exam 2 is released today at noon and is due at 2pm on Monday.
- No TA OH during the exam.
- I will have OH 4-5pm on Friday (on Zoom).
- Any clarification questions must be emailed to me only.
- No Slack use during the exam, even about non-exam related questions.
Answer keys:
- HW 4 and HW 5: Keys both posted, feedback coming soon.
- All lab keys also posted.
- AE 16 key missing complete answers, will post after class.

Review questions

Questions: Grouping

Does order of variables in group_by() matter?

library(palmerpenguins)
library(tidyverse)

penguins |>
  group_by(species, sex) |>
  summarize(mean_bm = mean(body_mass_g))

# A tibble: 8 × 3
# Groups:   species [3]
  species   sex    mean_bm
  <fct>     <fct>    <dbl>
1 Adelie    female   3369.
2 Adelie    male     4043.
3 Adelie    <NA>       NA 
4 Chinstrap female   3527.
5 Chinstrap male     3939.
6 Gentoo    female   4680.
7 Gentoo    male     5485.
8 Gentoo    <NA>       NA

penguins |>
  group_by(sex, species) |>
  summarize(mean_bm = mean(body_mass_g))

# A tibble: 8 × 3
# Groups:   sex [3]
  sex    species   mean_bm
  <fct>  <fct>       <dbl>
1 female Adelie      3369.
2 female Chinstrap   3527.
3 female Gentoo      4680.
4 male   Adelie      4043.
5 male   Chinstrap   3939.
6 male   Gentoo      5485.
7 <NA>   Adelie        NA 
8 <NA>   Gentoo        NA
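
A minimal sketch of one way to check this, reusing the two pipelines above. The object names (`by_species_sex`, `by_sex_species`, `a`, `b`, `same_means`) are made up for illustration. The group means are the same either way; what changes is the row order and which variable is still grouped after summarize(), as the `# Groups:` lines in the two outputs show.

```r
library(palmerpenguins)
library(dplyr)

by_species_sex <- penguins |>
  group_by(species, sex) |>
  summarize(mean_bm = mean(body_mass_g))

by_sex_species <- penguins |>
  group_by(sex, species) |>
  summarize(mean_bm = mean(body_mass_g))

# By default summarize() drops the last grouping variable,
# so the grouping that is left over depends on the order
group_vars(by_species_sex)  # "species"
group_vars(by_sex_species)  # "sex"

# Sort both results the same way, then compare the means
a <- by_species_sex |> ungroup() |> arrange(species, sex)
b <- by_sex_species |> ungroup() |> arrange(species, sex)
same_means <- isTRUE(all.equal(a$mean_bm, b$mean_bm))
same_means
```

The leftover grouping matters if more dplyr verbs follow the summarize() step, since they will operate within those groups.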

Questions: Factors

When will we use factors and how does that make a difference in the data?
When do you use fct_relevel() versus fct_reorder()?
How do we use the case_when() function, and what is the proper use of forcats functions like fct_relevel(), fct_reorder(), and fct_other()?

Review: https://r4ds.hadley.nz/factors.html.
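
As a rough illustration of the functions asked about above, here is a sketch on a small made-up data frame. The data, the column names, and the price cutoffs are all hypothetical. Roughly: fct_relevel() sets the level order by hand, fct_reorder() sets it from the values of another variable, fct_other() lumps levels into "Other", and case_when() builds a new variable from conditions.

```r
library(dplyr)
library(forcats)

# Hypothetical toy data, not from the course materials
items <- tibble(
  size  = c("small", "large", "medium", "small", "large", "tiny"),
  price = c(2, 9, 5, 3, 10, 1)
)

items <- items |>
  mutate(
    # case_when(): conditions are checked top to bottom, first match wins
    price_band = case_when(
      price < 4 ~ "cheap",
      price < 8 ~ "mid",
      TRUE      ~ "expensive"
    ),
    # fct_relevel(): we decide the order of the levels ourselves
    size_manual = fct_relevel(size, "tiny", "small", "medium", "large"),
    # fct_reorder(): levels ordered by the median price within each level
    size_by_price = fct_reorder(size, price, .fun = median),
    # fct_other(): keep the named levels, lump everything else into "Other"
    size_lumped = fct_other(size, keep = c("small", "large"))
  )

levels(items$size_manual)
levels(items$size_by_price)
levels(items$size_lumped)
```

The level order is what controls the order of bars, boxes, or legend entries in a plot, which is usually the reason to reach for these functions.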

Questions: LaTeX / equations

What is the name of the math symbol text we use to write equations? Is there a cheat sheet with the shortcuts for each symbol?
Do we have to use LaTeX for our equations?

\(H_0:\mu_1 - \mu_2 = 0\)

\(H_A: \mu_1 - \mu_2 \ne 0\)

$H_0:\mu_1 - \mu_2 = 0$

$H_A: \mu_1 - \mu_2 \ne 0$

H0: mu1 - mu2 = 0

HA: mu1 - mu2 ≠ 0

Questions: Ethics

Would love to review data ethics and how to answer questions about ethical issues with any dataset.

Review: The videos from the Ethics module.

Questions: Miscellaneous

What does geom_smooth(method = "loess") do?

Fits a smooth, non-linear curve to the data using LOESS (locally estimated scatterplot smoothing).
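
To see the difference from a straight-line fit, a quick sketch with the penguins data can help. The choice of variables and the red comparison line are assumptions made for illustration only.

```r
library(ggplot2)
library(palmerpenguins)

p <- ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point(alpha = 0.5) +
  # Smooth, non-linear curve (with its default uncertainty band)
  geom_smooth(method = "loess") +
  # Straight line from a linear model, for comparison
  geom_smooth(method = "lm", color = "red", se = FALSE)

p
```

Rows with missing flipper length or body mass are dropped with a warning when the plot is drawn.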

How do we know how much to round by?

Round as much as it makes sense in the context of the data. Avoid rounding in interim steps.
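
A small sketch of why rounding in interim steps can hurt, using made-up numbers: rounding each value before summing gives a different answer than rounding once at the end.

```r
# Hypothetical values
x <- c(2.44, 2.44, 2.44)

# Rounding each value first loses 0.04 three times
rounded_early <- sum(round(x, 1))  # 2.4 + 2.4 + 2.4

# Rounding only the final result keeps the full precision until the end
rounded_late <- round(sum(x), 1)   # round(7.32, 1)

rounded_early
rounded_late
```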

Question: Inference

How do we decide whether to use bootstrap, simulate, or permute in the generate() step of inference?

Bootstrap: For constructing bootstrap intervals or for testing for a single mean (\(H_0: \mu_0 = 5\))
Simulate: For testing for a single proportion (\(H_0: p_0 = 0.3\))
Permute: For testing for independence, i.e., for testing for differences in means or proportions across groups (or whether one is less/greater than the other)
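
The three cases above might look like this with the infer package and the penguins data. This is a sketch under assumptions: the null value p = 0.5, the number of reps, the seed, and the object names are chosen only for illustration.

```r
library(infer)
library(dplyr)
library(palmerpenguins)

set.seed(199)

# Drop rows with missing values in the variables used below
penguins_complete <- penguins |>
  filter(!is.na(body_mass_g), !is.na(sex))

# Bootstrap: interval for a single mean
boot_dist <- penguins_complete |>
  specify(response = body_mass_g) |>
  generate(reps = 500, type = "bootstrap") |>
  calculate(stat = "mean")

ci <- boot_dist |>
  get_confidence_interval(level = 0.95, type = "percentile")

# Simulate: test for a single proportion, H_0: p = 0.5 (hypothetical null value)
# Recent versions of infer also accept type = "draw" for this
null_prop <- penguins_complete |>
  specify(response = sex, success = "female") |>
  hypothesize(null = "point", p = 0.5) |>
  generate(reps = 500, type = "simulate") |>
  calculate(stat = "prop")

# Permute: test for independence, here a difference in mean body mass by sex
null_diff <- penguins_complete |>
  specify(body_mass_g ~ sex) |>
  hypothesize(null = "independence") |>
  generate(reps = 500, type = "permute") |>
  calculate(stat = "diff in means", order = c("male", "female"))
```

In each case the generate() type follows from the hypothesize() step: no hypothesis (an interval) or a point null for a mean goes with bootstrap, a point null for a proportion goes with simulate, and an independence null goes with permute.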

Application exercise

`ae-19`

Go to the course GitHub org and find your ae-19 (repo name will be suffixed with your GitHub name).
Clone the repo in your container, open the Quarto document in the repo, and follow along and complete the exercises.
You should have already pushed updates on Tuesday, so you should be good for submission.