Department of Electrical & Computer Engineering

University of Thessaly (Volos)


The course deals with basic concepts and data management issues for big data in cloud computing environments.


Dimitrios Katsaros

Suggested books

Title | Hadoop: The Definitive Guide | Data-Intensive Text Processing with MapReduce | Cloudonomics: The Business Value of Cloud Computing
Authors | Tom White | Jimmy Lin and Chris Dyer | Joe Weinman
Edition | Fourth (April 2015) | First (June 2010) | First (October 2012)

Students will be evaluated by:

The lectures will start on September 29th, 2017 (Room 'Σ', Papastratos building, 18:00-21:30)

Papers marked with three stars (***) must be read.

Lectures' schedule

Week Date Subject Slides Links to related articles
1 29/09/2017 a) Introduction to cloud computing
b) Introduction to MapReduce and Hadoop
Lecture 1
Lecture 1b
Cloud Computing - A Primer: Part 1
Cloud Computing - A Primer: Part 2
Cloud Computing - A summary of issues
The datacenter as a computer (2nd ed.)
MapReduce: Simplified data processing on large clusters
Apache's Hadoop
Other processing engines:
a) Spark b) Storm c) Samza d) Flink
Quantitative comparison of Hadoop versus Spark
2 06/10/2017 Exercises on MapReduce
3 13/10/2017 a) Assignment of 1st set of exercises: Problem-set01
b) Hadoop MapReduce code for word counting
c) Dynamo-style replication systems
d) The CAP theorem
e) Eventual consistency - Bounded Staleness for Partial Quorums
Lecture 2 Dynamo: Amazon's highly available key-value store***
Consistent Hashing
Fast, minimal memory, consistent hash algorithm***
Eric Brewer's keynote: Towards robust distributed systems
Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services***
IEEE Computer issue on the CAP Theorem
Probabilistically bounded staleness for practical partial quorums
Eventual consistent databases: A survey
4 20/10/2017 a) Announcement of bonus Hadoop project: (bonus) Project
b) Virtualization - Virtual machine migration
Lecture 3 The architecture of VMs***
Virtual Machine Monitors
Rethinking the design of Virtual Machine Monitors
Virtual Machines
Live migration of virtual machines
Black-box and gray-box strategies for virtual machine migration
Server consolidation techniques
5 27/10/2017 Allocation for multiple resource types Lecture 4 Dominant Resource Fairness***
6 03/11/2017 a) Exercises on DRF
b) Task scheduling for heterogeneous computing
Lecture 5 HEFT
7 10/11/2017 a) Assignment of 2nd set of exercises: Problem-set02
b) Exercises on DRF, HEFT
c) Indexing for clouds (background): Bloom filters, R-trees
Lecture 6
BF false positives
Bloom filter
Network applications of Bloom filters
Theory and practice of Bloom filters for distributed systems
The R-tree
8 24/11/2017 a) Indexing for clouds: A-tree
b) Cloud migration decisions I (Buy-or-Lease decisions I)
c) Assignment of 3rd set of exercises: Problem-set03
Lecture 7 The A-tree
The A-tree (complete description)***
To lease or buy CPUs?
9 01/12/2017 a) Cloud migration decisions II
b) Cloud migration decisions III
Lecture 8 To lease or not to lease from storage clouds?
On-premise or SaaS?***
10 07/12/2017
(Room Γ1, 16:00-18:00. Make-up lecture, replacing the one on 22/12)
a) Elasticity: The value of on-demand Lecture 9
11 08/12/2017 Memcached and its variants (MemC3, Memshare)
Lecture 10 MemCached
Cuckoo hashing
12 15/12/2017 Paxos and Raft: Distributed consensus Lecture 11
(not part of the exam material)
Two Phase Commit protocol for atomicity in Distributed DBMSs
Exam 10/01/2018 (Room Γ1, 10:00-12:30) Final written exam

dkatsar AT inf DOT uth DOT gr
Last updated: Tue, 12 December 2017