Code: ΜΔΑ-283

Semester: 1st

ECTS: 7.5

E-Services

Category: Obligatory

Objective

The ability to collect and store data has increased significantly as a result of innovation in various areas, such as the internet, e-commerce, electronic transactions, bar-code readers, mobile devices and intelligent machines. Data mining is a rapidly growing field that deals with the development of techniques that aim to help data owners make intelligent use of these collections.

This course covers methods for selecting and preparing data before analysis and knowledge mining techniques are applied. It also presents the basic techniques used to extract useful knowledge patterns from large data collections, and studies techniques for analyzing various types of data, including text, data from the World Wide Web and social networks. Through this course, students are expected to acquire significant technical skills in data analysis and to become familiar with knowledge mining algorithms and methods.

After successfully completing the course, students will be able to:

  • assess the quality of the data to be analyzed and apply the necessary data preparation techniques
  • choose the appropriate data mining technique based on the requirements and data types
  • apply data mining techniques
  • use appropriate techniques and tools to extract knowledge from data collections
  • evaluate the quality of data mining results

Learning outcomes

  • Searching for, analyzing and synthesizing data and information, using the necessary technology
  • Adapting to new situations
  • Decision-making
  • Working independently
  • Production of new research ideas
  • Project planning and management
  • Criticism and self-criticism

Syllabus

  • Basic concepts in data mining and data preparation

    Requirements and review of basic data mining tasks. Data cleaning and transformation. Similarity and distance measures. Overview of analytical forecasting methods.

  • Clustering

    Introduction to basic clustering algorithms for large databases. Spectral clustering methods. Partitional and hierarchical clustering. Clustering of non-linearly separable data. Fuzzy clustering.

    Techniques for evaluating clustering results.

  • Classification

    Basic types of classification. Statistical classification. Discriminant function analysis. Support vector machines. Evaluation criteria for classification methods. Cross-classification analysis. Typical applications.

  • Dimensionality reduction techniques

    The problem of high dimensionality. Presentation of basic dimensionality reduction techniques (PCA, SVD).

  • Association rules, frequent itemsets

    Apriori algorithm, comparison of algorithms, representative association rules.

  • Link analysis

    Hyperlink analysis topics, page ranking algorithms, hubs and authorities (HITS).

  • Social network analysis

    Network modeling, graph metrics (degree, betweenness centrality, connected components), clustering coefficient.

  • Community extraction from graphs

    Introduction to the basic concepts of clustering on graph data. Basic techniques for extracting communities from graphs.

  • Text mining

    Text representation model, similarity measures, predictive models for text, clustering techniques.

  • Recommender systems

    Content-based systems, collaborative filtering systems, personalization, knowledge mining techniques for large-scale recommender systems, evaluation of recommender systems, applications of recommender systems.
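As an illustration of the association rules topic in the syllabus, the following is a minimal sketch of a level-wise, Apriori-style search for frequent itemsets. The function name `frequent_itemsets`, the absolute `min_support` count and the small shopping-basket data are assumptions made for this example; they do not come from the course material.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for itemsets found in at least
    min_support transactions (illustrative Apriori-style sketch)."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into candidates of size k.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))}
        # Count step: scan the transactions for the surviving candidates.
        current = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t)
            if support >= min_support:
                current[c] = support
        result.update(current)
        k += 1
    return result


# Hypothetical example data: four shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
itemsets = frequent_itemsets(baskets, min_support=2)
```

The prune step is what distinguishes this approach from brute-force enumeration: a candidate is counted only if all of its smaller subsets were already found to be frequent.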
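The page ranking algorithms mentioned under link analysis can be sketched with a simple power iteration. This is a simplified PageRank-style computation under stated assumptions: the function name `pagerank`, the damping factor of 0.85, the fixed number of iterations and the four-page link graph are all chosen for illustration only.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to.
    Returns a dict of page -> rank; ranks sum to 1 (illustrative sketch)."""
    pages = sorted(set(links) | {q for targets in links.values() for q in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n
                    for p in pages}
        # Each page passes an equal share of its rank to every page it links to.
        for p in pages:
            targets = links.get(p, [])
            for q in targets:
                new_rank[q] += damping * rank[p] / len(targets)
        rank = new_rank
    return rank


# Hypothetical link graph: D has no incoming links, C has the most.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(graph)
```

In this small graph, page C receives links from every other page and ends up with the highest rank, while D, which nobody links to, keeps only the baseline share.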
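For the text representation and similarity measures listed under text mining, one common choice is a bag-of-words model compared with cosine similarity. The sketch below assumes whitespace tokenization and raw term counts; the function name `cosine_similarity` and the sample strings are hypothetical and serve only as an example.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between two texts represented as
    bag-of-words term-count vectors (illustrative sketch)."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    # Dot product over the words the two texts share.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


same = cosine_similarity("data mining", "data mining")
partial = cosine_similarity("data mining methods", "data mining")
disjoint = cosine_similarity("data mining", "social networks")
```

Identical texts score 1, texts with no words in common score 0, and partially overlapping texts fall in between.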

Bibliography