Fall 2017 CISC4631: Data Mining
Click for Class Schedule
General Information
Instructor: | Dr. Yijun Zhao |
Email: | yzhao@cis.fordham.edu |
Lecture Time: | Tuesday & Friday 10:00am - 11:15am Leon Lowenstein Bldg. 520 |
Office Hours (primary): | Friday 4:00pm - 5:30pm Leon Lowenstein Bldg. 305 |
Office Hours (secondary): | Thursday 4:00pm - 5:00pm Leon Lowenstein Bldg. 812 |
Other Q&A Resource: | peer-based Q&A available via Piazza. Signup link: http://piazza.com/fordham/fall2017/cisc4631l01 |
Textbook
Jiawei Han, Micheline Kamber, and Jian Pei,
Data Mining: Concepts and Techniques, 3rd edition, Morgan Kaufmann, 2011
Recommended books for further reading:
Learning from Data by Yaser S. Abu-Mostafa, Malik Magdon-Ismail and Hsuan-Tien Lin, AMLBook (2012)
Pattern Recognition and Machine Learning by Christopher M. Bishop, Springer (2011)
Note:
- All 3 books are on reserve for CISC4631 at Quinn Library with 2-hour loans.
- Additional readings will be distributed throughout the course.
Description of Course
This course introduces concepts, algorithms, and techniques of data mining as well as
the practical issues that arise when applying these algorithms to real-world problems. The students
will learn various aspects of data mining, including classification, prediction, ensemble methods,
association rules, sequence mining, time series mining and cluster analysis. The homework assignments consist of both
theory (written) and programming components.
The class project involves building a predictive model using a large real-world data set.
Prerequisites
The students are expected to:
- Have knowledge of data structures, algorithms, basic linear algebra, and basic statistics.
- Be familiar with at least one programming language and have programming experience.
Course Outline (Topical):
- Matrix Data
- Linear Regression
- Classification: KNN, Decision Tree, SVM, Naive Bayes, Logistic Regression
- Clustering: Hierarchical Clustering, K-means, K-medoids, DBSCAN
- Set Data
- Sequence Data
- Time Series
Homework
- There will be six HW assignments.
- Each assignment's due date is posted on the schedule.
- Each assignment is graded on a scale of 100.
Course Project
There will be a course project to predict an individual's income level using data extracted from the census bureau database.
The project can be completed either individually or as a group (strongly recommended). A group can have at most 4 people.
More details will be discussed in class.
Exams
- There will be one midterm exam.
- There will be a final exam.
  The final exam is cumulative, with emphasis on the material covered after the midterm.
Grading
- Course project: 20%
- Homework: 40%
- Midterm exam: 20%
- Final exam: 20%
Note:
- Failing to complete a HW or the course project on time
will result in a 10% reduction for each day it is late.
- Grading disputes must be resolved within two weeks of receiving your score.
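As a rough illustration of how the weights above combine, the Python sketch below computes a course total out of 100. This is not an official grading formula. The function names, the equal weighting of the six homework assignments, and the reading of the late penalty (10% of the earned score lost per late day, floored at zero) are all assumptions made for the example.

```python
# Weights are taken from the Grading section; everything else is illustrative.
WEIGHTS = {"project": 0.20, "homework": 0.40, "midterm": 0.20, "final": 0.20}


def apply_late_penalty(score, days_late, rate=0.10):
    """Reduce a 0-100 score by `rate` of its value per late day.

    Assumed interpretation of the late policy; the result never drops below 0.
    """
    factor = max(0.0, 1.0 - rate * days_late)
    return score * factor


def course_score(project, homework_scores, midterm, final):
    """Combine component scores (each on a 0-100 scale) into a 0-100 total.

    Assumes every homework assignment counts equally toward the homework share.
    """
    homework_avg = sum(homework_scores) / len(homework_scores)
    return (WEIGHTS["project"] * project
            + WEIGHTS["homework"] * homework_avg
            + WEIGHTS["midterm"] * midterm
            + WEIGHTS["final"] * final)


# Hypothetical student: six homework scores, one submitted two days late.
homework = [90, 85, 100, apply_late_penalty(80, days_late=2), 95, 70]
total = course_score(project=88, homework_scores=homework, midterm=75, final=82)
print(round(total, 2))
```

Under these assumptions, a late homework affects the total only through the 40% homework share, so a single late submission has a limited effect on the final score.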
Additional Remarks
- Academic Honesty
All work produced in this course should be your own unless it is specifically stated that you may work with others.
You may discuss the homework problems with other students generally, but may not provide complete solutions
to one another; copying of homework solutions is always unacceptable and will be considered a violation of
Fordham's academic integrity policy. Violations of this policy will be handled in accordance with university
policy, which can include automatic failure of the assignment and/or failure of the course.
For more information, please refer to the Academic Integrity website.
- Makeup Exam
There will be no make-up exams given after the exam date. If you know in advance that you will have to miss an exam, you must check with me (in advance) to avoid getting a zero for that exam. In case of illness on an exam date, please contact me as soon as possible, so that appropriate arrangements can be made.
Last modified: Aug. 28, 2017