CISC 5800: Machine Learning



Class times: Wednesday, 4:00 – 6:00pm, Lowenstein (LL) 306
Instructor: Prof. Daniel D. Leeds (my homepage)
Office: LL 815D (for Office Hours, starting Sept 16), JMH 328A (Bronx, non-office hour times)
E-mail:
Office hours: Wednesday 3 – 4 and by appointment (I may add Tuesday 4-5 as office hours as well depending on demand)

Full syllabus will be available here.

Course text: No book is required, but the following two will be useful as references.


Matlab: We will complete our programming assignments in Matlab. There are several ways to use Matlab/Matlab-equivalent-software:
Pre-requisites: It is essential you have a base level of computer science and math background to be able to succeed in this class.

Sections below:
  1. Resources
  2. Announcements
  3. Slides
  4. Assignments
  5. Practice
  6. Answers

Resources:
Computing guides
Linux Commands - important Linux commands for working on storm
vi Commands - important commands for the vi text editor; you are welcome to use emacs instead of vi
A Guide to Putty - Information for Windows users on accessing storm
Extra background on Matlab


Announcements:
December 11, 2:45pm I am holding office hours December 14 at my Lincoln Center location 1-3pm.

December 6, 7:00pm I am holding office hours December 7 at my Lincoln Center location 4-5pm.

December 3, 2:40pm Our final class will be Wednesday, December 9. Most of the class is scheduled to be a review session. Come with any questions you have. I also plan to hold extra office hours at Lincoln Center Monday afternoons Dec 7 and Dec 14. Please e-mail me if you'd like to come and tell me what times you are available -- I will try to schedule my hours to accommodate the most students.

November 21, 9:40pm The final exam will be December 16 at 4pm.

November 4, 9:15am Office hours today will be after class, 6:30-7:30pm.


Slides:
Lecture 1, Introduction + probability, calculus, Matlab programming. [pptx version]
Supplementary Linux lecture
Lecture 2, Bayesian classifiers + Logistic Regression, A few clarifying updates made night of Sept 30! [pptx version]
September 23 guest lecture, Robotics, vision, and mapping; related papers: IROS, ICRA
Lecture 3, Discriminative classifiers; in-class (October 7) correction now added [pptx version]
Lecture 3B - October 14, Further lecture on discriminative classifiers [pptx version]
Lecture 4 - November 4-11, Dimensionality reduction, updated [pptx version]
Lecture 5, Hidden Markov Models, last update Dec 5, 6:20pm (M step corrected on Dec 4; definition of β given for β_{T-1} on Dec 5) [pptx version]
For another perspective on EM in HMMs (also called "Baum-Welch" when used with HMMs), check out the first 3 pages of these Stanford lecture notes (a_{i,j} in their notes is A_{j,i} in my notes, and b in their notes is Φ in my notes), or page 618 and onward in the Bishop text, or page 608 and onward in the Murphy text.
Lecture 6, Bayes Nets and more EM, updated Dec 4, 4pm [pptx version]


Assignments:
Homework 0 - due September 16. I recommend you do it by September 9! This is largely to test your background knowledge for the course, though it covers a small amount of the calculus we learned on September 2. Homework 0 has now been corrected to remove the calculus questions. You should find at least 85% of the remaining questions to be review of material you already know; otherwise, you should consider taking other courses.
HW0 instructions updated for coding questions 2 and 3 -- the questions have been clarified, but the basic instructions remain the same.
My answers to HW 0

Ungraded Matlab assignment - I highly recommend you complete this assignment this week (Sept 21-Sept 25). The questions are on pages 1 and 3, extra background is on page 2.

My answers
newFunc.m - file for download
sampleData.mat - file for download


Homework 1 - due September 30. The version has been updated as of Sept 24, 6:20pm to make one question easier (question A1), fix one typo (in question A6), and provide a few clarifications (in part B - page 2 AND 3)
hw1data.mat
hw1dataTest.mat — new, improved matrix of testing data

My answers to HW 1

Homework 2 - written part due October 14, programming part due October 16.
hw2data.mat
Note: As this is a real data set, and we are using a relatively simple learning method, your maximum classification performance may not be as impressive as you expected (e.g., under 70% correct).
Note: For questions 7-9, you should assume each rumble value measured is an independent sample from the probability distribution of its respective class. (This note is intended for mathematical rigor. If you feel you understood questions 7-9 better without this clarifying note, you may disregard this note.)
Another note: For question 7, there is a single answer to express the likelihood p(D|μ1,μ2,μ3,μ4) - in other words, the one answer incorporates the data from all four classes within it.
Part C note: For logistic-regression based classification, you will classify your data as class 1 if p(yi=1|xi;θ)>p(yi=0|xi;θ), and otherwise classify the data as class 2.
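
The Part C decision rule can be sketched in a few lines. This is only an illustration in Python (the assignment itself is in Matlab), and it assumes the common convention p(y=1|x;θ) = 1/(1+exp(-θ·x)); the lecture slides may use a different sign convention, so check against them. The names sigmoid, classify, x and theta are hypothetical and do not come from the homework files.

```python
import math


def sigmoid(z):
    """Logistic function, branched on the sign of z to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def classify(x, theta):
    """Return (label, p1) for one data point.

    x and theta are equal-length lists of numbers (hypothetical names).
    Assumes p(y=1|x;theta) = sigmoid(theta . x); this convention is an
    assumption, not taken from the course slides.
    """
    z = sum(t * xi for t, xi in zip(theta, x))
    p1 = sigmoid(z)   # p(y=1 | x; theta)
    p0 = 1.0 - p1     # p(y=0 | x; theta)
    # Class 1 only if p1 is strictly larger; ties fall to class 2,
    # following the "otherwise" in the note above.
    label = 1 if p1 > p0 else 2
    return label, p1
```

Under this assumed convention, comparing the two probabilities is the same as checking whether θ·x is greater than 0, so the sigmoid is only needed if you also want the probability itself.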

My answers to HW 2 - several alternate valid solutions to part C questions 3 and 4
If you add your Part C score to your on-paper (Part A+B) score, here is the letter breakdown:
86-96.5 "A range"
75-86 "B range"
64-75 "C range"


Final project instructions - due Dec 13.
Some frequently asked questions for the projects
Extra pointers on the final project, discussed in class November 18.

Homework 3 - Parts A and B due November 18; Part C due November 20; Revised night of November 11 to remove Hidden Markov Models and to expand in Part A; Correction to SVM kernel question made October 12
svmData.mat
letter-recognition.mat
My answers to HW 3
51-57 "A range"
44.5-51 "B range"
38-44.5 "C range"
31.5-38 "D range"


Homework 4 - Now OPTIONAL. If you choose to submit, it is due December 9. A few corrections made Sunday, Dec 6, 12:30pm, shown in red
hw4Data.mat
Note: Grade for HW4 will only be incorporated into your final grade if it benefits you.
My answers to HW 4 (both written and coding parts!)

Practice:
Practice midterm questions available here. I made a few updates on the evening of October 15, which are written in bold.
Here are my practice answers.

Here is a review sheet of various relevant formulas and topics, to help in your studying.

Final practice
Practice final questions available here.
Here are my practice answers, with one correction regarding reconstruction error added on the evening of Dec 14.


Answers:
Midterm answers