Class activity work

Code
students <- c("David", "Dibaloke", "Emily", "Tedi", "Alberto", 
              "Caldwell", "Eimienwanlan", "Edward", "Mark", "Adam", 
              "Jennifer", "Josh", "Kathryn", "Clint", "Brigida")
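# Randomly split a class into two groups, each with its own leaders.
#   all        : character vector of all student names
#   leader_can : students still eligible to lead
#   seed       : random seed, reset before every draw for reproducibility
#   no_lead1, no_lead2 : number of leaders in groups 1 and 2
#   no_mem1    : number of non-leader members in group 1
generate_group <- function(all, leader_can, seed, 
                           no_lead1 = 2, no_lead2 = 2, no_mem1 = 6) {
    students <- all
    # Group 1 leaders come from the eligible candidates
    set.seed(seed)
    lead1 <- sample(leader_can, no_lead1)
    students <- setdiff(students, lead1)
    leader <- setdiff(leader_can, lead1)
    # Group 1 members come from everyone not yet assigned
    set.seed(seed)
    member1 <- sample(students, no_mem1)
    students <- setdiff(students, member1)
    leader <- setdiff(leader, member1)
    # Group 2 leaders come from the candidates still unassigned
    set.seed(seed)
    lead2 <- sample(leader, no_lead2)
    # Everyone left over becomes a group 2 member
    member2 <- setdiff(students, lead2)
    # leader_can in the result lists the candidates who have not led yet
    return(list("G1_leader" = lead1,
                "G1_member" = member1,
                "G2_leader" = lead2,
                "G2_member" = member2,
                "leader_can" = setdiff(leader_can, c(lead1, lead2))))
}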

Class activity 1

What is bootstrapping, and why do we use it? Explain the idea and show how it can be applied to simple linear regression, including a code demo.
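As a starting point, here is a minimal sketch of one common variant, the case-resampling (pairs) bootstrap, applied to the slope of a simple linear regression. The simulated data, the number of resamples `B`, and all object names are assumptions made for illustration; other variants, such as the residual bootstrap, are equally valid choices for the demo.

```r
# Case-resampling (pairs) bootstrap for the slope of a simple linear regression.
# The data are simulated purely for illustration.
set.seed(1)
n <- 50
x <- runif(n, 0, 10)
y <- 2 + 0.5 * x + rnorm(n, sd = 1)
dat <- data.frame(x = x, y = y)

fit <- lm(y ~ x, data = dat)

B <- 1000  # number of bootstrap resamples (an arbitrary choice)
boot_slope <- replicate(B, {
    idx <- sample(n, replace = TRUE)           # resample rows with replacement
    coef(lm(y ~ x, data = dat[idx, ]))[["x"]]  # refit and keep the slope
})

boot_se <- sd(boot_slope)                         # bootstrap standard error
boot_ci <- quantile(boot_slope, c(0.025, 0.975))  # 95% percentile interval

# Compare with the usual normal-theory results reported by lm()
summary(fit)$coefficients["x", "Std. Error"]
confint(fit, "x")
boot_se
boot_ci
```

When the linear model assumptions hold, as they do in this simulation, the bootstrap standard error and interval should be close to the ones from `lm()`. The bootstrap becomes more useful when those assumptions are doubtful.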

Resources

The course slides on Simulation-based Inference could be helpful. Please also collect and prepare your own materials.

Code
act1 <- generate_group(all = students, leader_can = students, seed = 6250)

Group 1

Slides

R code example-1 example-2

Leaders:

[1] "Dibaloke" "Mark"    

Members:

[1] "Emily"    "Jennifer" "Alberto"  "Clint"    "Kathryn"  "David"   

Group 2

Slides

Leaders:

[1] "Caldwell" "Tedi"    

Members:

[1] "Eimienwanlan" "Edward"       "Adam"         "Josh"         "Brigida"     

Knowledge Check

Please go to https://forms.office.com/r/q7MXBw3raw to answer the knowledge check questions.

Evaluation

For leaders, please go to https://forms.office.com/r/D2RPbi5VK2 to nominate the most engaged members. Note that the two leaders make a single decision and nominate members jointly.

For group members, please go to https://forms.office.com/r/WUCKQNJ3Sd to evaluate your leaders’ performance.

Class activity 2

Tip

Focus on the concepts and ideas, and avoid mathematical details.

  1. Define Generalized Linear Models (GLMs) and Generalized Additive Models (GAMs). What are their similarities and differences?

  2. Why use these models? What are the main properties of the two models? What can they do that multiple linear regression cannot?

  3. Provide an example or special case of GLM and GAM.

  4. Give a code demo showing how to run a GLM and a GAM.
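For item 4, a minimal sketch might look like the following. It assumes the `mgcv` package is available for `gam()` (it is one of R's recommended packages), and it uses simulated data, a logistic regression as the GLM example, and a single smooth term as the GAM example. All of these are illustrative choices, not requirements of the activity.

```r
library(mgcv)  # provides gam() and s(); assumed to be installed

set.seed(1)
n <- 300
x <- runif(n, -3, 3)

# GLM example: logistic regression (binary response, logit link)
p <- plogis(-0.5 + 1.2 * x)
y_bin <- rbinom(n, size = 1, prob = p)
fit_glm <- glm(y_bin ~ x, family = binomial)
summary(fit_glm)

# GAM example: Gaussian response whose mean is a smooth, nonlinear function of x
y_num <- sin(x) + rnorm(n, sd = 0.3)
fit_gam <- gam(y_num ~ s(x), family = gaussian)
summary(fit_gam)

# A straight-line fit to the same data, for comparison
fit_lm <- lm(y_num ~ x)
AIC(fit_lm, fit_gam)
```

The GLM keeps a linear predictor but changes the response distribution and link, while the GAM replaces the linear term with a smooth function estimated from the data.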

Code
leader_can <- setdiff(students, c(act1$G1_leader, act1$G2_leader))
act2 <- generate_group(all = students, leader_can = leader_can, seed = 6250)

Group 1

Slides

R code GLM GAM

Leaders:

[1] "Emily"   "Kathryn"

Members:

[1] "Dibaloke" "Adam"     "Alberto"  "Clint"    "Josh"     "David"   

Group 2

Slides

R code GLM GAM

Leaders:

[1] "Edward"       "Eimienwanlan"

Members:

[1] "Tedi"     "Caldwell" "Mark"     "Jennifer" "Brigida" 

Knowledge Check

Please go to https://forms.office.com/r/Bt5rVJ5igB to answer the knowledge check questions.

Evaluation

For leaders, please go to https://forms.office.com/r/Ys2rAFW8pe to nominate the most engaged members. Note that the two leaders make a single decision and nominate members jointly.

For group members, please go to https://forms.office.com/r/tk9FrnV9ZM to evaluate your leaders’ performance.

Class activity 3

Tip

Focus on the concepts and ideas, and avoid mathematical details.

  1. Define and explain the meaning of a loss function in machine learning.

  2. Compare loss functions used in regression, including mean squared error (MSE, L2 loss), mean absolute error (MAE, L1 loss), and Huber loss. What are their properties? Use simulated data with outliers to explain why fitting with the L1 or Huber loss is more robust than OLS, which uses the L2 loss.

  3. Explain the log loss used in binary logistic regression. Define and explain the meaning of maximum likelihood estimation and of cross-entropy. How is the log loss related to maximum likelihood and cross-entropy?
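For the simulation in item 2, one possible sketch is shown below. The data-generating process, the outlier contamination, the Huber threshold `delta = 1.5`, and the helper `fit_line()` are all assumptions made for illustration. The L1 and Huber fits use general-purpose numerical minimization with `optim()` rather than a dedicated robust-regression routine, so the estimates are approximate.

```r
# Simulate a linear relationship, then contaminate the 10 largest-x points
set.seed(1)
n <- 100
x <- runif(n, 0, 10)
y <- 1 + 2 * x + rnorm(n, sd = 1)       # true intercept 1, true slope 2
out <- order(x, decreasing = TRUE)[1:10]
y[out] <- y[out] - 40                   # outliers far below the true line

# Per-residual losses
l1_loss <- function(r) abs(r)
huber_loss <- function(r, delta = 1.5) {
    ifelse(abs(r) <= delta, 0.5 * r^2, delta * (abs(r) - 0.5 * delta))
}

# OLS minimizes the mean squared residual (L2 loss) in closed form
ols <- coef(lm(y ~ x))

# Fit a line by numerically minimizing the mean of a given loss,
# starting from the OLS estimate (hypothetical helper)
fit_line <- function(loss, start) {
    objective <- function(beta) mean(loss(y - beta[1] - beta[2] * x))
    optim(start, objective, control = list(maxit = 2000))$par
}

est <- rbind(OLS   = ols,
             L1    = fit_line(l1_loss, ols),
             Huber = fit_line(huber_loss, ols))
colnames(est) <- c("intercept", "slope")
est
```

Because the L2 loss grows quadratically, the ten contaminated points pull the OLS line strongly toward them. The L1 and Huber losses grow only linearly for large residuals, so their fitted slopes should stay much closer to the true value of 2.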

Code
leader_can <- setdiff(students, 
                      c(act1$G1_leader, act1$G2_leader,
                        act2$G1_leader, act2$G2_leader))
act3 <- generate_group(all = students, 
                       leader_can = leader_can, 
                       seed = 6250)

Group 1

Leaders:

[1] "Alberto" "David"  

Members:

[1] "Emily"    "Jennifer" "Caldwell" "Clint"    "Kathryn"  "Dibaloke"

Group 2

Leaders:

[1] "Josh" "Adam"

Members:

[1] "Tedi"         "Eimienwanlan" "Edward"       "Mark"         "Brigida"     

Knowledge Check

Please go to https://forms.office.com/r/WhjhuJ781U to answer the knowledge check questions.

Evaluation

For leaders, please go to https://forms.office.com/r/ZYt35Czaxr to nominate the most engaged members. Note that the two leaders make a single decision and nominate members jointly.

For group members, please go to https://forms.office.com/r/uC8wJMSEtn to evaluate your leaders’ performance.

Class activity 4

Topic: Principal Component Regression and Partial Least Squares

Tip

Focus on the concepts and ideas, and avoid mathematical details.

  1. Explain the two methods, principal component regression (PCR) and partial least squares (PLS). Why and when do we use them? Discuss their similarities and differences.

  2. Give a code demo of the two methods.
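To make the contrast concrete, here is a minimal base-R sketch that builds only the first component of each method by hand. The simulated data and all object names are assumptions for illustration. A real demo would more likely use a dedicated package, for example `pcr()` and `plsr()` from the `pls` package, together with cross-validation to choose the number of components.

```r
# Simulate five highly correlated predictors driven by one latent factor
set.seed(1)
n <- 200
z <- rnorm(n)
X <- sapply(1:5, function(j) z + rnorm(n, sd = 0.3))
colnames(X) <- paste0("x", 1:5)
y <- 3 * z + rnorm(n)

Xs <- scale(X)       # center and scale the predictors
yc <- y - mean(y)    # center the response

# PCR: the direction is found from X alone (PCA), then y is regressed on the score
pca <- prcomp(Xs)
pc1 <- pca$x[, 1]
fit_pcr <- lm(yc ~ pc1)

# PLS, first component only: the direction uses y, with weights proportional to X'y
w <- drop(crossprod(Xs, yc))
w <- w / sqrt(sum(w^2))   # normalize to unit length
t1 <- drop(Xs %*% w)
fit_pls <- lm(yc ~ t1)

c(PCR = summary(fit_pcr)$r.squared, PLS = summary(fit_pls)$r.squared)
```

Both methods replace the correlated predictors with a small number of derived components. The difference is that PCR chooses its directions without looking at the response, while PLS chooses them using the response.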

Group 1

Leaders:

[1] "Jennifer" "Brigida" 

Members:

[1] "Dibaloke" "Mark"     "Tedi"     "Josh"     "Adam"     "David"   

Group 2

Leaders:

[1] "Clint"

Members:

[1] "Emily"        "Alberto"      "Caldwell"     "Eimienwanlan" "Edward"      
[6] "Kathryn"     

Knowledge Check

Please go to https://forms.office.com/r/zCh1YfFCiZ to answer the knowledge check questions.

Evaluation

For leaders, please go to https://forms.office.com/r/0gsHsVSpkJ to nominate the most engaged members. Note that leaders in the same group make a single decision and nominate members jointly.

For group members, please go to https://forms.office.com/r/AkAUYvucsD to evaluate your leaders’ performance.