Working with multiple data frames

Lecture 6

Dr. Mine Çetinkaya-Rundel

Duke University
STA 199 - Fall 2022

9/15/22

Warm up

While you wait for class to begin…

  • Open your ae-03 project (the one you started on Tuesday) in RStudio, render your document, and commit and push.
  • Any questions from the prepare materials? Go to slido.com / #sta199. You can also upvote others’ questions.

Announcements

  • Recap: Asking code-related questions on Slack
    • Ideally: Code formatted text, not screenshots
    • If need be: Screenshots, not photos of screens
    • Always include code along with the error
  • Troubleshooting 101: Read your error messages in full, out loud if need be
  • Code formatting 101:
    • Always add a line break after + and |>
    • Add line breaks as needed after , to help code fit on the rendered PDF
    • Think poetry (short lines), not novellas (long sentences)
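
A minimal sketch of these formatting rules in a dplyr pipeline. The `scores` data frame and its columns are made up for illustration and are not part of the course materials.

```r
library(dplyr)

# Hypothetical data, made up for illustration
scores <- tibble(
  student = c("A", "B", "C", "D"),
  section = c("01", "01", "02", "02"),
  score = c(90, 85, 70, 95)
)

# Hard to read: the whole pipeline on one long line
# scores |> filter(score > 80) |> group_by(section) |> summarize(mean_score = mean(score), n = n())

# Easier to read: a line break after each |>, and after commas where needed
section_summary <- scores |>
  filter(score > 80) |>
  group_by(section) |>
  summarize(
    mean_score = mean(score),
    n = n()
  )
```

The same pattern applies to ggplot2 code, where each layer goes on its own line after a +.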

ae-03

Continue work from Tuesday…

Joining datasets

Describe a scenario where two datasets that contain information about students from this class may need to be joined. What might the analysis be about? What column (information) could be used to join the datasets?

03:00
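
One possible scenario, sketched in R: one data frame records each student's lab section, another records exam scores, and both share a student ID column. All names and values below (`roster`, `exam`, `student_id`) are hypothetical; `left_join()` comes from dplyr.

```r
library(dplyr)

# Hypothetical class roster: one row per enrolled student
roster <- tibble(
  student_id = c(1, 2, 3),
  section = c("01", "01", "02")
)

# Hypothetical exam records: one row per student who took the exam
exam <- tibble(
  student_id = c(1, 2, 4),
  exam_score = c(88, 92, 75)
)

# Keep every student on the roster and attach exam scores where available
roster_scores <- roster |>
  left_join(exam, by = "student_id")
```

In this sketch, student 3 has no exam record, so their `exam_score` is `NA`. Student 4 is not on the roster, so their row does not appear in the result.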

Application exercise

Goal

Join data from multiple data frames, summarize it, and create this plot.
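
The general shape of such a workflow might look like the sketch below. The data frames (`students`, `survey`) and their columns are hypothetical stand-ins, not the data used in the application exercise, and the bar chart is only one possible plot.

```r
library(dplyr)
library(ggplot2)

# Hypothetical data, made up for illustration
students <- tibble(
  student_id = 1:4,
  major = c("Statistics", "Biology", "Statistics", "Biology")
)

survey <- tibble(
  student_id = 1:4,
  study_hours = c(10, 6, 8, 4)
)

# Join the two data frames on the shared ID, then summarize by major
hours_by_major <- students |>
  inner_join(survey, by = "student_id") |>
  group_by(major) |>
  summarize(mean_hours = mean(study_hours))

# Plot the summary; print p to display the chart
p <- ggplot(hours_by_major, aes(x = major, y = mean_hours)) +
  geom_col() +
  labs(x = "Major", y = "Mean weekly study hours")
```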

ae-04

  • Go to the course GitHub org and find your ae-04 repo (the repo name will be suffixed with your GitHub name).
  • Clone the repo in your container, open the Quarto document in the repo, and follow along and complete the exercises.
  • Render, commit, and push your edits by the AE deadline – 3 days from today.

Recap of AE

  • TBD…