Welcome to STA 199

Lecture 1

Dr. Mine Çetinkaya-Rundel

Duke University
STA 199 - Fall 2022

8/30/22

Hello world!

Meet the prof

Dr. Mine Çetinkaya-Rundel

Professor of the Practice

Old Chem 213

Headshot of Dr. Mine Çetinkaya-Rundel

Meet the course team

  • Foxx Hart - Lecture helper
  • Bora Jin - Course organizer, lab leader
  • Jordan Bryan - Head TA, lab leader
  • Richard Fremgen - Lab leader
  • Leah Johnson - Lab leader
  • Alison Reynolds - Lab leader
  • Brooke Harmon - Lab helper
  • Jasmine Wen - Lab helper
  • Amber Potter - Lab helper
  • Shawon (One) Chowdhury - Lab helper
  • Gaurav Sirdeshmukh - Lab helper

Meet each other!

Meet data science

  • Data science is an exciting discipline that allows you to turn raw data into understanding, insight, and knowledge.

  • We’re going to learn to do this in a tidy way – more on that later!

  • This is an introductory course in data science, with an emphasis on statistical thinking.

Software

Excel - not…

An Excel window with data about countries

R

An R shell

RStudio

An RStudio window

Data science life cycle

Import

Data science life cycle, with import highlighted

Tidy + transform

Data science life cycle, with tidy and transform highlighted

Visualize

Data science life cycle, with visualize highlighted

Model

Data science life cycle, with model highlighted

Understand

Data science life cycle, with understand highlighted

# A tibble: 5 × 2
  date             season
  <chr>            <chr> 
1 23 January 2017  winter
2 4 March 2017     spring
3 14 June 2017     summer
4 1 September 2017 fall  
5 ...              ...   
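
A minimal sketch of how a season column like the one above could be derived from the date strings, using base R only. The season rule is an assumption (meteorological seasons in the Northern Hemisphere); it matches the four rows shown, but the slides do not say which rule was used. The variable names are made up for illustration.

```r
# Dates as they appear in the tibble above (stored as character strings)
dates <- c("23 January 2017", "4 March 2017", "14 June 2017", "1 September 2017")

# Pull out the month name (the second word) and convert it to a month number.
# month.name is a built-in base R vector of English month names.
month_names <- vapply(strsplit(dates, " "), function(parts) parts[2], character(1))
month_num <- match(month_names, month.name)

# Assumed rule (not stated in the slides): meteorological seasons,
# Northern Hemisphere, so March through May is spring, and so on.
season_lookup <- c(
  "winter", "winter", "spring", "spring", "spring", "summer",
  "summer", "summer", "fall", "fall", "fall", "winter"
)
season <- season_lookup[month_num]

# Combine into a data frame with the same two columns as the tibble
data.frame(date = dates, season = season)
```

The point is that the raw date is just text until it is transformed into something a reader (or a model) can reason about.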

Communicate

Data science life cycle, with communicate highlighted

Understand + communicate

Data science life cycle, with understand and communicate highlighted

Program

Data science life cycle, with program highlighted
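
To make the stages above concrete, here is a rough sketch in base R that touches each one. It is not the course's own example: the dataset (R's built-in `cars`), the unit conversion, and the model are all assumptions picked to keep the code self-contained.

```r
# Import: write a built-in dataset to a temporary CSV, then read it back in
path <- tempfile(fileext = ".csv")
write.csv(cars, path, row.names = FALSE)
dat <- read.csv(path)

# Tidy + transform: convert stopping distance from feet to metres
dat$dist_m <- dat$dist * 0.3048

# Visualize: scatterplot of speed against stopping distance
plot(dat$speed, dat$dist_m,
     xlab = "Speed (mph)", ylab = "Stopping distance (m)")

# Model: fit a simple linear regression of distance on speed
fit <- lm(dist_m ~ speed, data = dat)

# Understand + communicate: report the estimated slope in a sentence
slope <- unname(coef(fit)["speed"])
cat("Each extra mph is associated with about", round(slope, 2),
    "m more stopping distance\n")
```

Program is the stage that surrounds all the others: every step here is a line of code that can be rerun.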

Let’s dive in!

Application exercise

Or more like a demo for today…

📋 github.com/sta199-f22-1/ae-0-unvotes

Course overview

Homepage

https://sta199-f22-1.github.io/

  • All course materials
  • Links to Sakai, GitHub, RStudio containers, etc.
  • Let’s take a tour!

Course toolkit

All linked from the course website:

Important

Reserve an RStudio Container (titled STA 198-199) before lab on Monday!

Activities: Prepare, Participate, Practice, Perform

  • Prepare: Introduce new content and prepare for lectures by watching the videos and completing the readings
  • Participate: Attend and actively participate in lectures and labs, office hours, team meetings
  • Practice: Practice applying statistical concepts and computing with application exercises during lecture, graded for completion
  • Perform: Put together what you’ve learned to analyze real-world data
    • Lab assignments x 7 (first individual, later team-based)
    • Homework assignments x 5 (individual)
    • Take-home exams x 2
    • Term project presented in the last lab session

Cadence

  • Labs: Start and make substantial progress on Monday in lab section, finish up by Friday 5pm of that week
  • HWs: Posted Thursday morning, due following Thursday
  • Exams: Exam released Thursday morning, no lab on Monday of following week, due Monday 2pm
  • Project: Deadlines throughout the semester, with some lab and lecture time dedicated to working on them, and most work done in teams outside of class

Teams

  • Team assignments
    • Assigned by me
    • Application exercises, labs, and project
    • Peer evaluation during teamwork and after completion
  • Expectations and roles
    • Everyone is expected to contribute equal effort
    • Everyone is expected to understand all code turned in
    • Individual contribution evaluated by peer evaluation, commits, etc.

Grading

Category                Percentage
Homework                30% (5% x 6)
Labs                    14% (2% x 7)
Project                 15%
Exam 01                 18%
Exam 02                 18%
Application Exercises   2.5%
Teamwork                2.5%

See course syllabus for how the final letter grade will be determined.
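
As a quick check on how the category weights combine, here is a small sketch in R. The weights are copied from the grading table; the scores are hypothetical numbers made up for illustration, and the result is only an overall percentage, not a letter grade (the syllabus governs that).

```r
# Category weights in percent, copied from the grading table
weights <- c(
  homework = 30, labs = 14, project = 15,
  exam_01 = 18, exam_02 = 18,
  application_exercises = 2.5, teamwork = 2.5
)

# The weights should account for the whole grade
stopifnot(sum(weights) == 100)

# Hypothetical category averages on a 0-100 scale (made up for illustration)
scores <- c(
  homework = 92, labs = 88, project = 85,
  exam_01 = 80, exam_02 = 90,
  application_exercises = 100, teamwork = 100
)

# Weighted average: each score counts in proportion to its weight
overall <- sum(weights * scores[names(weights)]) / sum(weights)
overall
```

Indexing `scores` by `names(weights)` keeps the two vectors aligned even if they are written in a different order.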

Support

  • Attend office hours
  • Ask and answer questions on the discussion forum
  • Reserve email for questions on personal matters and/or grades
  • Read the course support page

Announcements

  • Posted on Sakai (Announcements tool) and sent via email; be sure to check both regularly
  • I’ll assume that you’ve read an announcement by the next “business” day
  • I’ll (try my best to) send a weekly update announcement each Friday, outlining the plan for the following week and reminding you what you need to do to prepare, practice, and perform

Diversity + inclusion

It is my intent that students from all diverse backgrounds and perspectives be well-served by this course, that students’ learning needs be addressed both in and out of class, and that the diversity that the students bring to this class be viewed as a resource, strength and benefit.

  • Please let me know your preferred name and pronouns on the Getting to know you survey.
  • If you feel like your performance in the class is being impacted by your experiences outside of class, please don’t hesitate to come and talk with me. I want to be a resource for you. If you prefer to speak with someone outside of the course, your advisors and deans are excellent resources.
  • I (like many people) am still in the process of learning about diverse perspectives and identities. If something was said in class (by anyone) that made you feel uncomfortable, please talk to me about it.

Accessibility

  • The Student Disability Access Office (SDAO) is available to ensure that students are able to engage with their courses and related assignments.

  • I am committed to making all course materials accessible and I’m always learning how to do this better. If any course component is not accessible to you in any way, please don’t hesitate to let me know.

Course policies

COVID policies

  • Wear a mask at all times!

  • Read and follow university guidance

Late work, waivers, regrades policy

  • We have policies!
  • Read about them on the course syllabus and refer back to them whenever you need to

Collaboration policy

  • Only work that is clearly assigned as team work should be completed collaboratively.

  • Homeworks must be completed individually. You may not directly share answers / code with others; however, you are welcome to discuss the problems in general and ask for advice.

  • Exams must be completed individually. You may not discuss any aspect of the exam with peers. If you have questions, post them as private questions on the course forum; only the teaching team will see and answer them.

Sharing / reusing code policy

  • We are aware that a huge volume of code is available on the web, and many tasks may have solutions posted.

  • Unless explicitly stated otherwise, this course’s policy is that you may make use of any online resources (e.g. RStudio Community, StackOverflow, etc.) but you must explicitly cite where you obtained any code you directly use or use as inspiration in your solution(s).

  • Any recycled code that is discovered and is not explicitly cited will be treated as plagiarism, regardless of source.

Academic integrity

To uphold the Duke Community Standard:

  • I will not lie, cheat, or steal in my academic endeavors;

  • I will conduct myself honorably in all my endeavors; and

  • I will act if the Standard is compromised.

Most importantly!

Ask if you’re not sure if something violates a policy!

This week’s tasks

  • Get a GitHub account if you don’t have one (some advice for choosing a username here)
  • Complete the Getting to know you survey if you haven’t yet done so!
  • Read the syllabus

Midori says…

Picture of my cat, Midori, with a speech bubble that says "Read the syllabus and make Mine happy!"

It’s the R community! 💛

Screenshot of a tweet from Rafa Moral saying he recorded an R version of the "it's corn" viral hit