Model Comparison Seminar

TU Dortmund, Summer semester 2024

Today

A change to the schedule - see Course schedule
Discussion of Hastie et al. (2009)
Plan for the next week:
1. Submit topic preferences
2. Read Shmueli (2010)

Discussion of Hastie et al. (2009)

Hastie et al. (2009)

Who has not read the paper?
Who has read the paper?

Train and test error

How does the paper define generalization?
How is the training error defined?
- How does it behave as models become more complex?
How is the test error defined?
- How does it behave as models become more complex?
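To make the complexity question concrete, here is a minimal sketch that uses k-nearest-neighbour regression as a stand-in model, where a smaller k means a more flexible fit. The data-generating function, noise level, sample sizes, and all function names are assumptions chosen for illustration; they are not taken from the paper.

```python
import math
import random

def simulate(n, rng):
    """Draw n points from y = sin(2*pi*x) + Gaussian noise (an assumed toy model)."""
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    ys = [math.sin(2 * math.pi * x) + rng.gauss(0.0, 0.3) for x in xs]
    return xs, ys

def knn_predict(x0, xs, ys, k):
    """Average the responses of the k training points closest to x0."""
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x0))[:k]
    return sum(ys[i] for i in nearest) / k

def mean_squared_error(x_eval, y_eval, x_train, y_train, k):
    """Average squared-error loss of the k-NN fit on the evaluation points."""
    losses = [(y - knn_predict(x, x_train, y_train, k)) ** 2
              for x, y in zip(x_eval, y_eval)]
    return sum(losses) / len(losses)

rng = random.Random(0)
x_train, y_train = simulate(50, rng)
x_test, y_test = simulate(500, rng)

errors = {}
for k in (25, 10, 5, 1):  # smaller k = more flexible fit
    train_err = mean_squared_error(x_train, y_train, x_train, y_train, k)
    test_err = mean_squared_error(x_test, y_test, x_train, y_train, k)
    errors[k] = (train_err, test_err)
    print(f"k={k:2d}  train={train_err:.3f}  test={test_err:.3f}")
```

With k = 1 every training point is its own nearest neighbour, so the training error is exactly zero while the test error is not.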

Goals and data split

What are the two goals when we examine the train and test error of models?
What are the purposes of a training set, validation set, and a test set?
- When is it appropriate to use such splits?
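As a starting point for the discussion of data splits, the sketch below cuts a data set into training, validation, and test parts. The 50/25/25 proportions follow the "typical split" the chapter mentions for data-rich situations; the function name and the seed are assumptions made for this example.

```python
import random

def three_way_split(n, fractions=(0.5, 0.25, 0.25), seed=0):
    """Shuffle indices 0..n-1 and cut them into train/validation/test parts."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = int(fractions[0] * n)
    n_val = int(fractions[1] * n)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test

train_idx, val_idx, test_idx = three_way_split(100)
print(len(train_idx), len(val_idx), len(test_idx))
```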

Optimism

Why is the training error typically less than the test error?
How does the article define the in-sample error?
What is optimism?
- How does it change when we increase the complexity of the model?
- How does it change when we increase the sample size?
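The sample-size question can be explored with a small simulation. The sketch below assumes the simplest possible model, an intercept-only fit (d = 1) under squared-error loss with Gaussian noise, and compares the simulated average optimism with 2·d·σ²/N, the value the chapter gives for linear fits. The in-sample error is approximated with a single new draw of responses per replicate, so the result is a noisy Monte Carlo estimate; all names and settings are illustrative.

```python
import random

def one_replicate(n, sigma, rng):
    """Return (training error, in-sample error) for the sample-mean predictor."""
    y = [rng.gauss(0.0, sigma) for _ in range(n)]
    fit = sum(y) / n  # intercept-only model, d = 1 parameter
    train_err = sum((yi - fit) ** 2 for yi in y) / n
    # In-sample error: new responses observed at the same training points.
    y_new = [rng.gauss(0.0, sigma) for _ in range(n)]
    in_sample_err = sum((yi - fit) ** 2 for yi in y_new) / n
    return train_err, in_sample_err

def average_optimism(n, sigma=1.0, reps=20000, seed=0):
    """Monte Carlo average of (in-sample error - training error)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        train_err, in_sample_err = one_replicate(n, sigma, rng)
        total += in_sample_err - train_err
    return total / reps

for n in (10, 40):
    print(f"N={n}: simulated {average_optimism(n):.3f}, "
          f"theory 2*d*sigma^2/N = {2 / n:.3f}")
```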

Methods

Which methods estimate the in-sample error?
Which methods estimate the extra-sample error?
Why is the in-sample error often considered instead of the extra-sample error?

AIC

How is AIC motivated (general rationale)?
Why does the AIC underestimate the test error in Figure 7.4 (left) for the most complex model?
The article says the AIC “does a reasonable job” for the 0-1 loss in Figure 7.5 (right), yet the AIC consistently overestimates the test error. Why do the authors nevertheless conclude that the results are “reasonable”?
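For reference during the discussion, the sketch below computes the AIC in the scaled form used in the chapter, AIC = -(2/N)·loglik + 2·d/N, for two nested Gaussian models. The data are simulated and the two candidate models are assumptions chosen only to keep the example short.

```python
import math
import random

def gaussian_loglik(y, mean, var):
    """Log-likelihood of i.i.d. N(mean, var) observations."""
    n = len(y)
    rss = sum((yi - mean) ** 2 for yi in y)
    return -0.5 * n * math.log(2 * math.pi * var) - rss / (2 * var)

def aic(loglik, d, n):
    """AIC = -(2/N) * loglik + 2 * d / N, with d the number of fitted parameters."""
    return -2.0 * loglik / n + 2.0 * d / n

rng = random.Random(1)
y = [rng.gauss(0.3, 1.0) for _ in range(40)]  # simulated data, not from the paper
n = len(y)

# Model A: mean fixed at 0, variance estimated by maximum likelihood (d = 1).
var_a = sum(yi ** 2 for yi in y) / n
loglik_a = gaussian_loglik(y, 0.0, var_a)
aic_a = aic(loglik_a, d=1, n=n)

# Model B: mean and variance both estimated by maximum likelihood (d = 2).
mean_b = sum(y) / n
var_b = sum((yi - mean_b) ** 2 for yi in y) / n
loglik_b = gaussian_loglik(y, mean_b, var_b)
aic_b = aic(loglik_b, d=2, n=n)

print(f"AIC(A) = {aic_a:.3f}, AIC(B) = {aic_b:.3f}")
```

The larger model always attains at least as high a maximised log-likelihood; the 2·d/N term is what allows the smaller model to be preferred.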

BIC

How is BIC motivated?
How does BIC relate to the Bayes factor?
How can we assess the relative merit of models using the BIC?
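One way to approach the last question is to turn BIC values into approximate posterior model probabilities, proportional to exp(-BIC/2), which presumes equal prior probabilities for the candidate models. The sketch below uses the definition BIC = -2·loglik + log(N)·d; the three BIC values are hypothetical numbers for illustration, not results from the chapter.

```python
import math

def bic(loglik, d, n):
    """BIC = -2 * loglik + log(N) * d."""
    return -2.0 * loglik + math.log(n) * d

def posterior_probs_from_bic(bic_values):
    """Approximate posterior model probabilities: exp(-BIC_m / 2), normalised."""
    best = min(bic_values)  # subtract the minimum for numerical stability
    weights = [math.exp(-0.5 * (b - best)) for b in bic_values]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical BIC values for three candidate models (illustration only).
bics = [102.0, 100.0, 106.0]
probs = posterior_probs_from_bic(bics)
print([round(p, 3) for p in probs])
```

Only differences in BIC matter here, which is why subtracting the minimum leaves the probabilities unchanged.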

Cross-validation

What is the ideal scenario for CV?
Why is it not always possible/feasible?
How can we get around it?
What is the trade-off in choosing between a large and a small K?
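The choice of K can be tried out with a minimal K-fold cross-validation sketch. It assumes squared-error loss and a simple least-squares line as the model; the simulated data and the helper names (`fit_line`, `k_fold_indices`, `cv_error`) are made up for this example.

```python
import random

def fit_line(xs, ys):
    """Least-squares intercept and slope for simple linear regression."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def k_fold_indices(n, k, seed=0):
    """Randomly partition indices 0..n-1 into k roughly equal folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cv_error(xs, ys, k, seed=0):
    """K-fold cross-validation estimate of squared prediction error."""
    total = 0.0
    for fold in k_fold_indices(len(xs), k, seed):
        held_out = set(fold)
        x_tr = [x for i, x in enumerate(xs) if i not in held_out]
        y_tr = [y for i, y in enumerate(ys) if i not in held_out]
        intercept, slope = fit_line(x_tr, y_tr)  # fit without the held-out fold
        total += sum((ys[i] - intercept - slope * xs[i]) ** 2 for i in fold)
    return total / len(xs)

rng = random.Random(0)
xs = [rng.uniform(0.0, 1.0) for _ in range(60)]
ys = [1.0 + 2.0 * x + rng.gauss(0.0, 0.5) for x in xs]  # simulated data
for k in (5, 10, len(xs)):  # K = N is leave-one-out
    print(f"K={k:2d}  CV error = {cv_error(xs, ys, k):.3f}")
```

Setting K equal to the sample size gives leave-one-out cross-validation, which requires one model fit per observation.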

Bootstrap

What is the basic idea behind the bootstrap?
How can we apply it to assessing model performance?
Why is the naive metric in (7.48) not ideal?
What is the alternative?
What is the idea behind the ‘.632’ estimator?
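The bootstrap estimators can be compared on a toy problem. The sketch below computes the training error, the leave-one-out bootstrap error (each case is predicted only from bootstrap samples that do not contain it), and the '.632' combination 0.368·training error + 0.632·leave-one-out bootstrap error. The mean predictor is a hypothetical stand-in for a real model, and the number of bootstrap samples and the data are assumptions for illustration.

```python
import random

def fit_mean(ys):
    """Hypothetical stand-in model: predict the training mean for every case."""
    return sum(ys) / len(ys)

def bootstrap_632(ys, n_boot=200, seed=0):
    """Return (training error, leave-one-out bootstrap error, .632 estimate)."""
    rng = random.Random(seed)
    n = len(ys)
    full_fit = fit_mean(ys)
    train_err = sum((y - full_fit) ** 2 for y in ys) / n

    loss_sum = [0.0] * n  # summed loss for case i over samples that exclude i
    count = [0] * n       # number of bootstrap samples that exclude case i
    for _ in range(n_boot):
        sample = [rng.randrange(n) for _ in range(n)]  # N draws with replacement
        fit = fit_mean([ys[j] for j in sample])
        in_sample = set(sample)
        for i in range(n):
            if i not in in_sample:
                loss_sum[i] += (ys[i] - fit) ** 2
                count[i] += 1

    # Average within each case first, then across cases that were ever left out.
    per_case = [s / c for s, c in zip(loss_sum, count) if c > 0]
    loo_boot_err = sum(per_case) / len(per_case)
    err_632 = 0.368 * train_err + 0.632 * loo_boot_err
    return train_err, loo_boot_err, err_632

rng = random.Random(0)
ys = [rng.gauss(0.0, 1.0) for _ in range(30)]  # simulated data
train_err, loo_boot_err, err_632 = bootstrap_632(ys)
print(f"train={train_err:.3f}  loo-bootstrap={loo_boot_err:.3f}  .632={err_632:.3f}")
```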

Questions

Next week…

Submit topic preferences

Please see the updated list of topics on the website
No later than Friday 26.04.2024
- If you don’t have a unique topic and a partner
  - Submit your preferences: https://forms.gle/97WgahvF5Ys26mdJ8
- If you have a unique topic and a partner
  - Send it to me by e-mail for approval

Plan for the next week

Read Shmueli (2010) before the next seminar for discussion
Confirmation of topics and groups

References

Hastie, T., Tibshirani, R., & Friedman, J. (2009). Model assessment and selection. In The elements of statistical learning: Data mining, inference, and prediction (pp. 219–259). Springer Series in Statistics. Springer, New York, NY. https://link.springer.com/chapter/10.1007/978-0-387-21606-5_7#preview

Shmueli, G. (2010). To Explain or to Predict? Statistical Science, 25(3), 289–310. https://doi.org/10.1214/10-STS330