Lectures

The schedule below is tentative and subject to change, depending on time and class interests. We will move at a pace dictated by class discussions. Please check this page often for updates.

Week	Who	Topic
1	Itamar	Course Overview & Basic ML Concepts
2	Itamar	Reproducibility and Version Control
3	Ariel	Regression and K Nearest Neighbors
4	Ariel	Classification
5	Ariel	Dimensionality Reduction
6	Ariel	Decision Trees and Random Forests
7	Ariel	Unsupervised Learning
8	Itamar	Prediction in Aid of Estimation I - Lasso and Average Treatment Effects
9	Itamar	Prediction in Aid of Estimation II - Trees and Heterogeneous Treatment Effects
10	Itamar	Prediction Policy Problems
11	Itamar	Text Analysis

*Note: Ariel's lectures (weeks 3-7) are only available on Moodle.

Week 1

Notes:
- Course Overview
- Basic ML Concepts

Selected references

Breiman, L. (2001). Statistical modeling: The two cultures. Statistical Science, 16(3), 199-231.

Athey, S. (2018). The impact of machine learning on economics. In The Economics of Artificial Intelligence: An Agenda. University of Chicago Press.

Mullainathan, S., & Spiess, J. (2017). Machine learning: an applied econometric approach. Journal of Economic Perspectives, 31(2), 87-106.

Week 2

Notes: Reproducibility and Version Control

DataCamp in-class:

Introduction to R
Working with the RStudio IDE (Part I)

Selected references

R and Tidyverse

R for Data Science (r4ds) by Garrett Grolemund and Hadley Wickham.
Data wrangling and tidying with the “Tidyverse” by Grant McDermott.
Getting used to R, RStudio, and R Markdown by Chester Ismay and Patrick C. Kennedy.
Data Visualization: A Practical Introduction by Kieran Healy.

Version Control

Happy Git and GitHub for the useR by Jenny Bryan.
Version Control with Git(Hub) by Grant McDermott.
Pro Git.

Week 8

Notes: ML in Aid of Estimation, Part I: Lasso and ATE

Selected references

Ahrens, A., Hansen, C. B., & Schaffer, M. E. (2019). lassopack: Model selection and prediction with regularized regression in Stata.

Belloni, A., Chen, D., Chernozhukov, V., & Hansen, C. (2012). Sparse Models and Methods for Optimal Instruments With an Application to Eminent Domain. Econometrica, 80(6), 2369–2429.

Belloni, A., & Chernozhukov, V. (2013). Least squares after model selection in high-dimensional sparse models. Bernoulli, 19(2), 521–547.

Belloni, A., Chernozhukov, V., & Hansen, C. (2014). Inference on treatment effects after selection among high-dimensional controls. Review of Economic Studies, 81(2), 608–650.

Belloni, A., Chernozhukov, V., & Hansen, C. (2014). High-Dimensional Methods and Inference on Structural and Treatment Effects. Journal of Economic Perspectives, 28(2), 29–50.

Chernozhukov, V., Hansen, C., & Spindler, M. (2015). Post-selection and post-regularization inference in linear models with many controls and instruments. American Economic Review, 105(5), 486–490.

Chernozhukov, V., Hansen, C., & Spindler, M. (2016). hdm: High-Dimensional Metrics. The R Journal, 8(2), 185–199.

Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., & Newey, W. (2017). Double/debiased/Neyman machine learning of treatment effects. American Economic Review, 107(5), 261–265.

Mullainathan, S., & Spiess, J. (2017). Machine Learning: An Applied Econometric Approach. Journal of Economic Perspectives, 31(2), 87–106.

Van de Geer, S. A., & Bühlmann, P. (2009). On the conditions used to prove oracle results for the lasso. Electronic Journal of Statistics, 3, 1360–1392.

Zhao, P., & Yu, B. (2006). On Model Selection Consistency of Lasso. Journal of Machine Learning Research, 7, 2541–2563.

Week 9

Notes: ML in Aid of Estimation, Part II: Trees and CATE

Selected references

Athey, S., & Imbens, G. (2016). Recursive partitioning for heterogeneous causal effects. Proceedings of the National Academy of Sciences, 113(27), 7353-7360.

Athey, S., Imbens, G. W., Kong, Y., & Ramachandra, V. (2016). An Introduction to Recursive Partitioning for Heterogeneous Causal Effects Estimation Using causalTree package. 1–15.

Davis, J. M. V., & Heller, S. B. (2017). Using Causal Forests to Predict Treatment Heterogeneity: An Application to Summer Jobs. American Economic Review: Papers & Proceedings, 107(5), 546–550.

Lundberg, I. (2017). Causal forests: A tutorial in high-dimensional causal inference. https://scholar.princeton.edu/sites/default/files/bstewart/files/lundberg_methods_tutorial_reading_group_version.pdf

Wager, S., & Athey, S. (2018). Estimation and Inference of Heterogeneous Treatment Effects using Random Forests. Journal of the American Statistical Association, 113(523), 1228–1242.

Week 10

Notes: Prediction Policy Problems

Selected references

Angwin, Julia, Jeff Larson, Surya Mattu, and Lauren Kirchner. 2016. “Machine Bias.” ProPublica, May 23. https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing.

Athey, S. (2018). The Impact of Machine Learning on Economics.

Athey, S., & Wager, S. (2018). Efficient Policy Learning.

Kleinberg, J., Ludwig, J., Mullainathan, S., & Obermeyer, Z. (2015). Prediction Policy Problems. American Economic Review: Papers & Proceedings, 105(5), 491–495.

Kleinberg, J., Ludwig, J., Mullainathan, S., & Rambachan, A. (2018). Algorithmic Fairness. American Economic Review: Papers & Proceedings, 108, 22–27.

Kleinberg, J., Lakkaraju, H., Leskovec, J., Ludwig, J., & Mullainathan, S. (2018). Human Decisions and Machine Predictions. Quarterly Journal of Economics, 133(1), 237–293.

Kleinberg, J., Mullainathan, S., & Raghavan, M. (2017). Inherent Trade-Offs in the Fair Determination of Risk Scores. Proceedings of the 8th Conference on Innovation in Theoretical Computer Science, 43, 1–23.

Mullainathan, S., & Spiess, J. (2017). Machine Learning: An Applied Econometric Approach. Journal of Economic Perspectives, 31(2), 87–106.

Week 11

Notes: Text Mining

Hands-on: Bank of Israel Minutes Text Analysis

Selected references

Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3(Jan), 993-1022.

Blei, D. M., & Lafferty, J. D. (2006, June). Dynamic topic models. In Proceedings of the 23rd international conference on Machine learning (pp. 113-120). ACM.

Gentzkow, M., Kelly, B.T. and Taddy, M. (forthcoming). The Quarterly Journal of Economics.

Hansen, S., McMahon, M., & Prat, A. (2017). Transparency and Deliberation Within the FOMC: A Computational Linguistics Approach. The Quarterly Journal of Economics, 133(2), 801–870.

Lafferty, J. D., & Blei, D. M. (2006). Correlated topic models. In Advances in neural information processing systems (pp. 147-154).

Loughran, T., & McDonald, B. (2011). When is a liability not a liability? Textual analysis, dictionaries, and 10‐Ks. The Journal of Finance, 66(1), 35-65.

Roberts, M. E., Stewart, B. M., Tingley, D., Lucas, C., Leder‐Luis, J., Gadarian, S. K., Albertson, B., & Rand, D. G. (2014). Structural topic models for open‐ended survey responses. American Journal of Political Science, 58(4), 1064-1082.

Roberts, M.E., Stewart, B.M. and Tingley, D., 2014. stm: R package for structural topic models. Journal of Statistical Software, 10(2), pp.1-40.

Assignments

Kaggle Competition

A website created by Itamar Caspi using RMarkdown.

Disclaimers: (1) The official syllabus and the content on the official Moodle website shall always prevail in case of any discrepancy or inconsistency between this website and its official HUJI versions; (2) This website and its content do not necessarily reflect the views of the Bank of Israel or any of its staff.