Course Code

ΜΔΑ-240

Semester

2nd Semester

ECTS Credits

7.5

Type of Course

Mandatory

Faculty

Christos Doulkeridis

A. Vlachou

Big Data and Analytics II: Techniques and Tools

Objective

The main objective of this course is (a) to acquaint students with the latest trends in Big Data management, and (b) to study advanced data analytics techniques focusing on domains such as text, recommendation, visualization, and graphs. Students are expected to gain deep insight into solutions for real-world Big Data management problems, and to acquire strong skills in designing and implementing such scalable solutions. Moreover, students are expected to apply analytical processing and data analysis techniques to practical problems related to modern data management.

 

Course Contents

Open data and Linked open data

The case for Open data. Open data repositories. Open governmental data. Open geospatial data. Linked data. Linked open data.

Big Spatial Data

The value of spatial data in modern applications. Geotagging. Platforms for scalable management of Big Spatial Data. SpatialHadoop. Spatial extensions for Spark. SpatialSpark.

Trends in Big Data management

Modern techniques in Big Data management. “One size does not fit all”. Data exploration. In-memory processing. In-situ processing. Novel platforms. Polystores.

The Industry’s view on Big Data

Novel architectures for Big Data management from the industrial sector. Google (incl. Pregel, Dremel, Giraph, F1). Facebook. Twitter. LinkedIn (Kafka). SAP (HANA).

The case of NoDB

Minimize data-to-query time. Database processing on raw data files. Avoiding the bottleneck of database design and data loading. Specific applications, such as scientific data analysis.

Multidimensional analytics

Data warehouses. Cubes. Multidimensional access methods. Dimensionality reduction. Multidimensional analysis.

Text analytics

Management of unstructured data. Challenges related to text management. Text indexing and search. Scoring. Term weighting. Vector space model. Computing scores in a complete search system. ElasticSearch.

Social media analytics

Social media monitoring. Collecting data from social media. Understanding social data. Analysing social data. The value of social data for today’s business. Trend detection. Business intelligence and social media.

Visual analytics

Data visualization tools and techniques. Visual data analytics. The Tableau tool. Applications in business intelligence. New visual interfaces. Advanced visualization techniques. Research prototype systems.

Graph analytics

Novel data management systems and techniques for processing large-scale graphs. Graph indexing. Parallel and distributed graph processing. Graph partitioning.

Recommended Readings

  • Samet H. (2006): Foundations of Multidimensional and Metric Data Structures. The Morgan Kaufmann Series in Computer Graphics, ISBN-10: 0123694469.
  • Leskovec J, Rajaraman A, Ullman J: Mining of Massive Datasets. Cambridge University Press.
  • Mamoulis N. (2011): Spatial Data Management. Synthesis Lectures on Data Management, Morgan & Claypool Publishers.

Additional Readings

  • White R.W. and Roth R.A.: Exploratory Search: Beyond the Query-Response Paradigm. Morgan & Claypool. ISBN: 9781598297836.
  • Selected research articles.