Data Mining
[ECE-627] Fall 2007

Department of ECE, University of Tehran

Lecturer:

TAs: Mona Haraty, Syavash Nobarany, Hadi Amiri



 

Brief Outline

Data relevant to managerial decisions is accumulating rapidly thanks to a host of technological advances. Electronic data capture has become inexpensive and ubiquitous as a by-product of innovations such as the internet, e-commerce, electronic banking, point-of-sale devices, bar-code readers, and intelligent machines. Such data is often stored in data warehouses and data marts specifically intended for management decision support. Data mining is a rapidly growing field concerned with developing techniques that help managers make intelligent use of these repositories. A number of successful applications have been reported in areas such as credit rating, fraud detection, database marketing, customer relationship management, and stock market investments. The field of data mining has evolved from the disciplines of statistics and artificial intelligence.
This course will examine methods that have emerged from both fields and proven to be of value in recognizing patterns and making predictions from an applications perspective. We will survey applications and provide an opportunity for hands-on experimentation with algorithms for data mining using easy-to-use software and cases.

Topics to be discussed include:

  • Association Rules
  • Clustering
  • Dimensionality Reduction
  • SVM, Neural Networks, Nearest Neighbour, Naïve Bayes
  • Regression
  • Time Series
  • Graph-Based Mining
    • Fast Frequent Subgraph Mining
  • Web Mining
    • Spam Detection
    • Burst Detection
    • Topic Finder
    • Web Page Cleaning
    • Mining XML Query Patterns
  • Recommendation Systems
  • Business Cases
    • Market Basket Analysis
    • Web Click Stream Analysis
    • Customer Relationship Management
    • Credit Scoring
    • Forecasting Television Audience

 

Learning Outcomes

A student who successfully completes this subject should understand the strengths and limitations of popular data mining techniques and be able to identify promising business applications of data mining. Students will be able to actively manage and participate in data mining projects executed by consultants or specialists in data mining. A useful takeaway from the course will be the ability to perform powerful data analysis in Excel.

 

Attendance

Students are expected to attend all classes and interact, answer questions, discuss issues and provide feedback. Satisfactory attendance is deemed to be attendance at approximately 80% of the allocated contact hours. However, attendance or class participation will not be part of the assessment for the subject.

 

Recommended Text

Important texts for the course include:

  • Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers, 2000. (main resource)
  • Masoud Mohammadian (ed.), Intelligent Agents for Data Mining and Information Retrieval, Idea Group Publishing, 2004. ISBN: 1591401941. (supplementary, few concepts)
  • Daniel T. Larose, Data Mining Methods and Models, John Wiley & Sons, Inc., 2006. (supplementary, few concepts)
  • Paolo Giudici, Applied Data Mining: Statistical Methods for Business and Industry, Wiley, 2005. ISBN: 10470-84678-X.
  • Hand, Mannila, and Smyth. Principles of Data Mining. Cambridge, MA: MIT Press, 2001. ISBN: 026208290X.
  • Berry and Linoff. Mastering Data Mining. New York, NY: Wiley, 2000. ISBN: 0471331236.
  • Delmater and Hancock. Data Mining Explained. New York, NY: Digital Press, 2001. ISBN: 1555582311.

Web Resources

A list of some of the most important research centres in IR will be given in class.

http://ece.ut.ac.ir/dbrg/hamshahri/ Hamshahri Collection

http://gate.ac.uk/ A General Architecture for Text Engineering

http://www.glue.umd.edu/~oard/research.html#recommender Doug Oard's research page

http://www.cs.cmu.edu/~mccallum/bow/ Bow: A Toolkit for Statistical Language Modeling, Text Retrieval, Classification and Clustering

http://www.virtualsalt.com/search.htm World Wide Web Research Tools

http://www.ccsu.edu/datamining/resources.html Central Connecticut State University

http://www.kdnuggets.com/index.html  KDnuggets Data Mining, Web Mining, Text Mining, and Knowledge Discovery

http://www2.sims.berkeley.edu/courses/is296a-4/f99/lectures.html SIMS 296a-4: Text Data Mining Schedule and Lectures, University of California at Berkeley

http://ocw.mit.edu/OcwWeb/Sloan-School-of-Management/15-062Data-MiningSpring2003/CourseHome/index.htm MIT OpenCourseWare: Data Mining

XLMiner. Download a free version

 

Additional Information

Contact Information:

Contact Research Faculty: Dr. Farhad Oroumchian, oroumchianATacmDOTorg

TAs: m.haratyATeceDOTutDOTacDOTir, s.nobaranyATeceDOTutDOTacDOTir, h.amiriATeceDOTutDOTacDOTir


 

Assignments

Assignment # 1

The data collections needed for the assignment are available on this website:
http://www.dataminingconsultant.com/DMMM.htm

Download the related files from here.

Other assignments will be announced.


Designed by Hadi Amiri
Data Mining, Department of ECE, University of Tehran, Fall 2007, Tehran, Iran.