Model Comparison Seminar

TU Dortmund, Summer semester 2024

Today

  1. Discussion of Navarro (2019)
  2. Plan for the rest of the course

Discussion of Navarro (2019)

Feedback

  • What did you think about the paper?
  • Who found it interesting?
  • What was difficult to follow?
  • What was easy to follow?
  • Who learned something new from the paper?

Context

  • This paper drops you in the middle of a lively statistical debate
  • Rough timeline:
    • New advancements in Bayesian cross-validation (e.g., Vehtari et al., 2017)
    • Gronau & Wagenmakers (2019) publish an article on limitations of LOO, illustrated with toy examples
    • Vehtari et al. (2019) argue against the use of those examples as a case against LOO
    • Navarro (2019) observes the debate and comments on it
  • Does Navarro (2019) sympathise with or support one set of authors?

Motivation

  • What is the aim of Navarro (2019) in this debate? E.g.
    • Whom does she address?
    • What questions does she want to pose or answer?
    • What perspective does she come from?

Toy problems

  • What does Navarro (2019) think about solving simple toy problems constructed to highlight advantages or limitations of different methods?
  • What are the positive outcomes of using such examples?
  • What do you think?

Scientific and technological views

  • Navarro (2019) discusses differences between scientific and technological approaches to modeling
  • What are their respective goals?
  • How does this distinction compare to prediction vs explanation (Shmueli, 2010)? Is the scientific approach comparable to the explanatory modeling described by Shmueli (2010)?
  • What happens if we evaluate models based solely on their ability to make predictions?

Generalization

  • Difference between how generalization is defined in statistics and how it is defined in empirical sciences such as psychology
  • Summarise the two different definitions
  • What definition do you remember from Hastie et al. (2009)?
  • What kind of generalization does Navarro (2019) find useful and why?
  • Why is Navarro (2019) sceptical of addressing generalization using statistical tools?

Tensions between statistical and scientific judgement

  • What is the point of the two figures discussed by Navarro (2019)?
  • Does Navarro (2019) think that statistical model selection criteria are useful in that context?
  • What does she think is an alternative?

Questions

Plans for the rest of the course

Plans

  • Final presentation dates: https://moodle.tu-dortmund.de/mod/data/view.php?id=1674369
  • May 13-June 3: Work on your topic
    • Voluntary attendance
    • Discussion and feedback
  • June 10-July 1: Presentations
    • Mandatory attendance
  • July 8: Course conclusion, Q&A
    • If everything goes well
    • Otherwise, July 15 as initially planned
  • July 31: Submit your report

References

Gronau, Q. F., & Wagenmakers, E.-J. (2019). Limitations of Bayesian leave-one-out cross-validation for model selection. Computational Brain & Behavior, 2(1), 1–11.
Hastie, T., Tibshirani, R., & Friedman, J. (2009). Model assessment and selection. In The elements of statistical learning: Data mining, inference, and prediction (pp. 219–259). Springer Series in Statistics. Springer, New York, NY. https://link.springer.com/chapter/10.1007/978-0-387-21606-5_7#preview
Navarro, D. J. (2019). Between the devil and the deep blue sea: Tensions between scientific judgement and statistical model selection. Computational Brain & Behavior, 2(1), 28–34. https://doi.org/10.1007/s42113-018-0019-z
Shmueli, G. (2010). To Explain or to Predict? Statistical Science, 25(3), 289–310. https://doi.org/10.1214/10-STS330
Vehtari, A., Gelman, A., & Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing, 27, 1413–1432.
Vehtari, A., Simpson, D. P., Yao, Y., & Gelman, A. (2019). Limitations of "Limitations of Bayesian leave-one-out cross-validation for model selection". Computational Brain & Behavior, 2(1), 22–27.