**CIS 664: KNOWLEDGE DISCOVERY AND DATA MINING**

*Spring 2004*

Time: Tuesday, 4:40-7:20pm; Place: Tuttleman 305 A

**Instructor: Zoran Obradovic**

303 Wachman Hall, zoran@ist.temple.edu, phone: 215 204 6094

**Office hours:** Tuesday 2-4pm and by appointment

**Goals:**

The objective of the knowledge discovery and data mining process is to extract nontrivial, implicit, previously unknown, and potentially useful information from massive datasets. The course is intended to serve as an introduction to the fundamental techniques required to support this process. The course is structured to give participants ample opportunity to learn about a growing new research area and to scout for promising research topics through hands-on experience.

**Prerequisites:**

Basic knowledge in Data Base Systems (CIS 616); programming skills in C or C++; basic statistics, graph theory, and linear algebra.

**Texts:**

Han J. and Kamber M.: *Data Mining: Concepts and Techniques*, Morgan Kaufmann Publishers, 2001, ISBN 1-55860-489-8 (*required*);

Hand D., Mannila H. and Smyth P.: *Principles of Data Mining*, The MIT Press, 2001, ISBN 0-262-08290-X (*optional*).

**Topics:** Will be tailored to the interests of the participants. Content will include:

I. An overview of data mining tasks and techniques;

II. Data preprocessing:

(1) data cleaning and transformation,

(2) data reduction.

III. Core data modeling topics:

(1) model functions (classification, regression, clustering, summarization, sequence analysis, outlier analysis),

(2) model representation (decision trees, Bayesian belief networks, neural networks, density models, partitioning, hierarchical, density-based, grid-based and model-based clustering algorithms, Apriori algorithm, correlation analysis).

IV. Advanced topics:

(1) spatial data mining,

(2) temporal and sequence data mining,

(3) text and web mining.

V. Reading and research project presentations.

**Grading:** Homework, reading assignments, and an individual research project.