Course Code

ΠΠΣ-130

Semester

1st Semester

ECTS Credits

7.5

Type of Course

Mandatory

Data Management

Objective

The main objective of the course is to train students in methodologies and technologies for data management. The course covers advanced topics in database design and query processing for modern data management architectures, in the context of broader network-centric systems and services. It also examines business intelligence topics, including business data integration, business process modeling, and advanced techniques for mining business data. On completing the course, students are expected to be able to develop traditional and network-centric systems and services effectively in database environments that involve structured, semi-structured, and unstructured data. Students also acquire basic knowledge and skills in data analysis and in extracting useful information from data.

 

Course Contents

Distributed data management

Basic concepts, problems, architectures, distributed query processing, peer-to-peer data management systems, unstructured and structured peer-to-peer networks.

Parallel data management

Fundamental concepts and architecture of parallel databases, parallel query processing, data management in the cloud, the MapReduce programming model, the Hadoop implementation, HDFS.
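As an illustration of the MapReduce programming model listed above, the following is a minimal single-process sketch of a word count. It is not Hadoop code: the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical, and the shuffle step that a real framework performs across machines is simulated here with an in-memory dictionary.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Group emitted values by key (done by the framework in a real system)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values that share a key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    groups = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in groups.items())


counts = word_count(["big data", "data management", "big big data"])
```

In a distributed setting each call to map_phase and reduce_phase could run on a different node, which is why neither function shares state with the other.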

Query processing and optimization

Rank-aware query processing, rank-join query processing, algorithms for rank-aware query processing, skyline queries, algorithms for processing skyline queries.
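To make the notion of a skyline query concrete, here is a small sketch that assumes every attribute is to be minimized. It follows the general idea of a block-nested-loop scan with the whole window kept in memory; the function names and the sample hotel data (price, distance to the beach) are illustrative assumptions, not course material.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points):
    """Return the points that are not dominated by any other point."""
    window = []
    for p in points:
        # Skip p if a point already in the window dominates it.
        if any(dominates(q, p) for q in window):
            continue
        # Otherwise p enters the window and evicts the points it dominates.
        window = [q for q in window if not dominates(p, q)]
        window.append(p)
    return window


hotels = [(50, 8), (80, 2), (60, 9), (90, 3), (40, 10), (70, 5)]
best = skyline(hotels)
```

Here (60, 9) is dropped because (50, 8) is both cheaper and closer, while (40, 10) stays in the result because no other hotel is cheaper.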

Dimensionality reduction – Feature selection

Multidimensional data, modeling, problems of high dimensionality (“the curse of dimensionality”, “the empty space phenomenon”), failure of indexing methods, dimensionality reduction algorithms, applications to practical data management problems.

Security and privacy issues

Authentication, access control, security policies, user roles (the RBAC model), the problem of publishing anonymized data, k-anonymity, l-diversity, privacy-enforcing mechanisms.
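The k-anonymity property mentioned above can be checked mechanically: a table is k-anonymous if every combination of quasi-identifier values occurs in at least k records. The sketch below assumes records are plain dictionaries; the attribute names and the generalized sample table are made up for illustration.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Check that each quasi-identifier combination appears at least k times."""
    group_sizes = Counter(
        tuple(record[attr] for attr in quasi_identifiers) for record in records
    )
    return all(size >= k for size in group_sizes.values())


# Hypothetical table in which age and zip code have already been generalized.
table = [
    {"age": "20-29", "zip": "130**", "disease": "flu"},
    {"age": "20-29", "zip": "130**", "disease": "asthma"},
    {"age": "30-39", "zip": "148**", "disease": "flu"},
    {"age": "30-39", "zip": "148**", "disease": "flu"},
]

two_anonymous = is_k_anonymous(table, ["age", "zip"], 2)
three_anonymous = is_k_anonymous(table, ["age", "zip"], 3)
```

The second group in this table shows why k-anonymity alone may be insufficient: both of its records share the same sensitive value, which is the kind of disclosure that l-diversity is meant to address.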

Business intelligence

Basic concepts, an industry viewpoint on business intelligence, new trends (Big Data, fast business, better software), business process modeling.

Information integration in business intelligence – Data preprocessing

Data selection, data cleaning, handling missing values, data integration, semantic heterogeneity, data visualization for decision support.

Object similarity

Distance measures/similarity measures for different data types (numerical, categorical, text), processing similarity queries (range queries and k-nearest neighbor queries), applications in machine learning.
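The two query types named above can be sketched with a linear scan over the data. The example assumes Euclidean distance for numerical vectors and Jaccard distance for sets of tokens; all function names are illustrative, and a real system would typically use an index rather than scanning every object.

```python
import heapq
import math


def euclidean(a, b):
    """Euclidean distance between two numerical vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def jaccard_distance(a, b):
    """One minus the ratio of shared items to all items in two collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def range_query(data, query, radius, dist):
    """Return every object whose distance to the query is at most radius."""
    return [p for p in data if dist(p, query) <= radius]


def knn_query(data, query, k, dist):
    """Return the k objects closest to the query, nearest first."""
    return heapq.nsmallest(k, data, key=lambda p: dist(p, query))


points = [(0, 0), (3, 4), (1, 1), (6, 8)]
within_five = range_query(points, (0, 0), 5, euclidean)
two_nearest = knn_query(points, (0, 0), 2, euclidean)
text_distance = jaccard_distance("data mining".split(), "data warehouse".split())
```

Because both queries take the distance function as a parameter, the same code serves numerical, categorical, and text data once a suitable measure is supplied.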

Data warehouses

Multidimensional data model, architecture of data warehouses, design of data warehouses, extract-transform-load (ETL), OLAP operations, data warehouses as tools for business intelligence.
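As a small illustration of an OLAP roll-up over a multidimensional model, the sketch below aggregates a measure over a chosen subset of dimensions. The fact table, its dimension names (year, region, product) and the roll_up helper are hypothetical; a data warehouse would normally express this as a GROUP BY query.

```python
from collections import defaultdict


def roll_up(facts, dimensions, measure):
    """Sum the measure over all facts that agree on the kept dimensions."""
    totals = defaultdict(float)
    for fact in facts:
        key = tuple(fact[d] for d in dimensions)
        totals[key] += fact[measure]
    return dict(totals)


# Hypothetical sales facts at the finest level of detail.
sales = [
    {"year": 2023, "region": "North", "product": "A", "amount": 100},
    {"year": 2023, "region": "North", "product": "B", "amount": 50},
    {"year": 2023, "region": "South", "product": "A", "amount": 70},
    {"year": 2024, "region": "North", "product": "A", "amount": 120},
]

by_year_region = roll_up(sales, ["year", "region"], "amount")
by_year = roll_up(sales, ["year"], "amount")
```

Dropping the region dimension moves one level up the hierarchy, from totals per year and region to totals per year.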

Data mining and text analysis

Basic data mining techniques and application to business intelligence, information extraction techniques from diverse data sources (text, Web, social networks).

Recommended Readings

  • Teorey T. J., Lightstone S. S., Nadeau T. and Jagadish H. V. (2011): Database Modeling and Design: Logical Design, Fifth Edition, Morgan Kaufmann, ISBN-10: 0123820200.
  • Teorey T. J. (1998): Database Modeling & Design: The Fundamental Principles, Morgan Kaufmann, ISBN-10: 1558602941.
  • Siau K. (2007): Contemporary Issues in Database Design and Information Systems Development, IGI Publishing, ISBN-10: 1599042894.
  • Ng R. T. et al. (2013): Perspectives on Business Intelligence, Morgan & Claypool Publishers, Synthesis Lectures on Data Management.

Additional Readings

  • Han J. and Kamber M. (2006): Data Mining: Concepts and Techniques, Morgan Kaufmann, ISBN 1-55860-901-6.
  • Vazirgiannis M., Halkidi M. and Gunopulos D. (2003): Uncertainty Handling and Quality Assessment in Data Mining, Springer Verlag, LNAI Series, ISBN-10: 1852336552.
  • Chakrabarti S. (2002): Mining the Web: Discovering Knowledge from Hypertext Data, Morgan Kaufmann, ISBN-10: 1558607544.
  • Chaudhuri S., Dayal U. and Narasayya V. (2011): An Overview of Business Intelligence Technology, Communications of the ACM, Vol. 54, No. 8, pp. 88-98.