Machine Learning Fall 2019
 
 
The Sackler Institute of Graduate Biomedical Sciences at NYU School of Medicine
 
Machine Learning Fall 2019 (BMSC-GA 4439 and BMIN-GA 1004)

Course Directors:
David Fenyö (David@FenyoLab.org)
Wenke Liu (Wenke.Liu@nyulangone.org)

Teaching Assistants:
Anna Yeaton (Anna.Yeaton@nyulangone.org)
Sonali Narang (Sonali.Narang@nyulangone.org)

Learning objectives

The student will learn and understand the most commonly used machine learning methods.

Course Material

Required Reading:

  • An Introduction to Statistical Learning: with Applications in R. James G, Witten D, Hastie T, Tibshirani R. Springer 2013.
  • Applied Predictive Modeling by Max Kuhn & Kjell Johnson, Springer 2013.
Recommended Reading:
  • Pattern Classification, 2nd Edition, Richard O. Duda, Peter E. Hart, David G. Stork, ISBN: 978-0-471-05669-0
  • The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Hastie T, Tibshirani R, Friedman J. Springer 2011.
  • Pattern Recognition and Machine Learning (Information Science and Statistics) by Christopher Bishop, ISBN-10: 0387310738

General Policies

Late/missed work: You must adhere to the due dates for all required submissions. If you miss a deadline, then you will not get credit for that assignment/post.

Incompletes: No "Incompletes" will be assigned for this course unless we are at the very end of the course and you have an emergency.

Responding to Messages: I will check e-mails daily during the week, and I will respond to course-related questions within 48 hours.

Announcements: I will make announcements throughout the semester by e-mail.

Make sure that your email address is updated; otherwise you may miss important emails from me.

Safeguards: Always back up your work in a safe place (an electronic file with a backup is recommended) and make a hard copy. Do not wait until the last minute to do your work. Leave enough time to meet each deadline.

Plagiarism: Plagiarism, the presentation of someone else's words or ideas as your own, is a serious offense and will not be tolerated in this class. The first time you plagiarize someone else's work, you will receive a zero for that assignment. The second time you plagiarize, you will fail the course with a notation of academic dishonesty on your official record.

Course Assessment

  • Weekly Problem Sets (50%)
  • Discussions (20%)
  • Final Project (30%)
Lectures

Lecture 1 Course Overview (September 3, 2019 Science Building, 435 East 30th St, 7th Floor, Room 720 3-4:30pm)
Lecturer: David Fenyö (Slides)

Reading List
  • An Introduction to Statistical Learning by Gareth James et al. Chapters 1-2
  • Applied Predictive Modeling by Kuhn & Johnson, Chapters 1-4
  • DREAM Challenges

    Additional Reading
  • Coursera: Machine Learning


    Lecture 2 Unsupervised Learning: Clustering (September 5, 2019 Science Building, 435 East 30th St, 7th Floor, Room 720 3-4:30pm)
    Lecturer: Wenke Liu (Slides)
    Tutorial Instructor: Anna Yeaton

    Reading List
  • An Introduction to Statistical Learning by Gareth James et al. Chapter 10
  • The Elements of Statistical Learning by Hastie et al. Chapter 14

    Additional Reading
  • Cluster Analysis (5th ed.). Everitt BS, Landau S, Leese M, Stahl D. Wiley: 2011.


    Lecture 3 Unsupervised Learning: Dimension Reduction (September 10, 2019 Science Building, 435 East 30th St, 7th Floor, Room 720 3-4:30pm)
    Lecturer: Wenke Liu (Slides)


    Lecture 4 Student Project Plan Presentation (September 12, 2019 Science Building, 435 East 30th St, 7th Floor, Room 720 3-4:30pm)


    Lecture 5 Supervised Learning: Regression (September 19, 2019 Science Building, 435 East 30th St, 7th Floor, Room 720 3-4:30pm)
    Lecturer: Wilson McKerrow (Slides)
    Tutorial Instructor: Anna Yeaton (RegressionExamples.R)

    Reading List
  • An Introduction to Statistical Learning by Gareth James et al. Chapter 3
  • Applied Predictive Modeling by Kuhn & Johnson, Chapters 5-6

    Additional Reading
  • The Elements of Statistical Learning by Hastie et al. Chapter 3


    Lecture 6 Supervised Learning: Classification (September 24, 2019 Science Building, 435 East 30th St, 7th Floor, Room 720 3-4:30pm)
    Lecturer: Anna Yeaton (Slides)

    Reading List
  • An Introduction to Statistical Learning by Gareth James et al. Chapter 4
  • Applied Predictive Modeling by Kuhn & Johnson, Chapters 11-12

    Additional Reading
  • The Elements of Statistical Learning by Hastie et al. Chapter 4


    Lecture 7 Supervised Learning: Performance Estimation & Regularization (September 26, 2019 Science Building, 435 East 30th St, 7th Floor, Room 720 3-4:30pm)
    Lecturer: Anna Yeaton (Slides)
    Tutorial Instructor: Anna Yeaton

    Reading List
  • An Introduction to Statistical Learning by Gareth James et al. Chapters 5 & 6

    Additional Reading
  • The Elements of Statistical Learning by Hastie et al. Chapter 7


    Lecture 8 Expectation Maximization (October 1, 2019 Science Building, 435 East 30th St, 7th Floor, Room 720 3-4:30pm)
    Lecturer: Wilson McKerrow (Slides)
    Tutorial Instructor: Wilson McKerrow (EMclass_examples.R)


    Lecture 9 Feature Selection (October 3, 2019 Science Building, 435 East 30th St, 7th Floor, Room 720 3-4:30pm)
    Lecturer: Zhi Li (Slides)

    Reading List
  • Applied Predictive Modeling by Kuhn & Johnson, Chapters 18-19


    Lecture 10 Student Project Exploratory Data Analysis Presentation (October 8, 2019 Science Building, 435 East 30th St, 7th Floor, Room 720 3-4:30pm)


    Lecture 11 Student Project Exploratory Data Analysis Presentation (October 10, 2019 Science Building, 435 East 30th St, 7th Floor, Room 720 3-4:30pm)


    Lecture 12 Tree-Based Methods (October 22, 2019 Science Building, 435 East 30th St, 7th Floor, Room 720 3-4:30pm)
    Lecturer: Wenke Liu (Slides)

    Reading List
  • An Introduction to Statistical Learning by Gareth James et al. Chapter 8
  • Applied Predictive Modeling by Kuhn & Johnson, Chapters 8 & 14
  • Carter H, Chen S, Isik L, et al. Cancer-specific High-throughput Annotation of Somatic Mutations: computational prediction of driver missense mutations. Cancer Research. 2009
  • Waks Z, Weissbrod O, Carmeli B, Norel R, Utro F, Goldschmidt Y. Driver gene classification reveals a substantial overrepresentation of tumor suppressors among very large chromatin-regulating proteins. Scientific Reports. 2016


    Lecture 13 Support Vector Machines (October 24, 2019 Science Building, 435 East 30th St, 7th Floor, Room 720 3-4:30pm)
    Lecturer: Wenke Liu (Slides)
    Tutorial Instructor: Sonali Narang

    Reading List
  • An Introduction to Statistical Learning by Gareth James et al. Chapter 9
  • Hyeran Byun and Seong-Whan Lee, Applications of Support Vector Machines for Pattern Recognition: A Survey, SVM 2002, LNCS 2388, pp. 213-236, 2002.
  • Mao Y, Chen H, Liang H, Meric-Bernstam F, Mills GB, Chen K. CanDrA: Cancer-Specific Driver Missense Mutation Annotation with Optimized Features. Adamovic T, ed. PLoS ONE. 2013

    Additional Reading
  • The Elements of Statistical Learning by Hastie et al. Chapter 10


    Lecture 14 Markov Models (October 29, 2019 Science Building, 435 East 30th St, 7th Floor, Room 720 3-4:30pm)
    Lecturer: Wilson McKerrow (Slides)
    Tutorial Instructor: Sonali Narang (kitten_markov_chain_example.R)


    Lecture 15 Neural Networks (October 31, 2019 Science Building, 435 East 30th St, 7th Floor, Room 720 3-4:30pm)
    Lecturer: David Fenyö (Slides)

    Reading List
  • The Elements of Statistical Learning by Hastie et al. Chapter 11
  • Alipanahi, Babak, et al. "Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning." Nature Biotechnology 33.8 (2015): 831-838.
  • Goodfellow, Ian, et al. "Generative adversarial nets." Advances in neural information processing systems. 2014.
  • Ioffe, Sergey, and Christian Szegedy. "Batch normalization: Accelerating deep network training by reducing internal covariate shift." arXiv preprint arXiv:1502.03167 (2015).

    Additional Reading
  • Neural Networks and Deep Learning by Michael Nielsen


    Lecture 16 Machine Learning Applied to Healthcare (November 5, 2019 Science Building, 435 East 30th St, 7th Floor, Room 720 3-4:30pm)
    Lecturer: Narges Razavian

    Reading List
  • Libbrecht MW, Noble WS. Machine learning applications in genetics and genomics. Nat Rev Genet. 16 (2015) 321-32.
  • Kircher M, Witten DM, Jain P, O'Roak BJ, Cooper GM, Shendure J. A general framework for estimating the relative pathogenicity of human genetic variants. Nat Genet. 46 (2014) 310-5.


    Lecture 17 Machine Learning Applied to Text Data (November 7, 2019 Science Building, 435 East 30th St, 7th Floor, Room 720 3-4:30pm)
    Lecturer: Stephen Johnson (Slides)


    Lecture 18 Machine Learning Applied to Omics Data (November 12, 2019 Science Building, 435 East 30th St, 7th Floor, Room 720 3-4:30pm)
    Lecturer: Kelly Ruggles
    Tutorial Instructor: Sonali Narang


    Lecture 19 Student Project Presentation (December 10, 2019 Science Building, 435 East 30th St, 7th Floor, Room 720 5:30-7pm)


    Lecture 20 Student Project Presentation (December 12, 2019 Science Building, 435 East 30th St, 7th Floor, Room 720 3-4:30pm)