  1. The course deals with:
    Basic concepts and data management issues for big data in cloud computing environments


Suggested books

Title: Hadoop: The Definitive Guide
Authors: Tom White
Edition: Fourth (April 2015)

Title: Data-Intensive Text Processing with MapReduce
Authors: Jimmy Lin and Chris Dyer
Edition: First (June 2010)

Title: Cloudonomics: The Business Value of Cloud Computing
Authors: Joe Weinman
Edition: First (October 2012)

Students will be evaluated by:

Lectures begin in the week of September 26th, 2016.

Lectures' schedule

Week Date Subject Slides Links to related articles
Week 1, 30/09/2016
  Subject:
    a) Introduction to cloud computing
    b) Introduction to MapReduce and Hadoop
  Slides: Lecture 1, Lecture 1b
  Links:
    Cloud Computing - A Primer: Part 1
    Cloud Computing - A Primer: Part 2
    Cloud Computing - A summary of issues
    The datacenter as a computer (2nd ed.)
    MapReduce: Simplified data processing on large clusters
    Apache's Hadoop
    Other processing engines: a) Spark b) Storm c) Samza d) Flink
    Quantitative comparison of Hadoop versus Spark
Week 2, 07/10/2016
  Subject:
    a) Assignment of 1st set of exercises: Problem-set01
    b) Exercises on MapReduce
  Slides and links:
    Hadoop MapReduce code for word counting
    Installation procedure
Week 3, 14/10/2016
  Subject: Dynamo-style replication systems
  Slides: Lecture 2
  Links:
    Dynamo: Amazon's highly available key-value store***
    Consistent Hashing
    Fast, minimal memory, consistent hash algorithm***
Week 4, 21/10/2016
  Subject: The CAP theorem
  Slides: Lecture 3
  Links:
    Eric Brewer's keynote: Towards robust distributed systems
    Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services***
    IEEE Computer issue on the CAP Theorem
Week 5, 04/11/2016
  Subject: Eventual consistency - Bounded Staleness for Partial Quorums
  Slides: Lecture 4
  Links:
    Eventually consistent
    Probabilistically bounded staleness for practical partial quorums
    Eventual consistent databases: A survey
Week 6, 07/11/2016
  Subject: Virtualization - Virtual machine migration
  Slides: Lecture 5
  Links:
    The architecture of VMs***
    Virtual Machine Monitors
    Rethinking the design of Virtual Machine Monitors
    Virtual Machines
    Live migration of virtual machines
    Black-box and gray-box strategies for virtual machine migration
    Server consolidation techniques
Week 7, 11/11/2016
  Subject:
    a) Assignment of 2nd set of exercises: Problem-set02
    b) Allocation for multiple resource types
    c) Exercises on VM migration, DRF
  Slides: Lecture 6
  Links:
    Dominant Resource Fairness***
Week 8, 18/11/2016
  Subject:
    a) Exercises on DRF
    b) Task scheduling for heterogeneous computing
Week 9, 25/11/2016
  Subject:
    a) Exercises on HEFT
    b) Indexing for clouds (background): Bloom filters
  Slides and links:
    Lecture 7
    BF false positives
    Bloom filter
    Network applications of Bloom filters
    Theory and practice of Bloom filters for distributed systems
Week 10, 02/12/2016
  Subject:
    a) Indexing for clouds (background): R-tree
    b) Indexing for clouds: A-tree
  Slides: Lecture 8
  Links:
    The R-tree
    The A-tree***
Week 11, 09/12/2016
  Subject:
    a) Cloud migration decisions
    b) The value of "on-demand"
  Slides: Lecture 9
  Links:
    To lease or not to lease from storage clouds?
    On-premise or SaaS?***
    To lease or buy CPUs?
    Time is money
Week 12, 16/12/2016
  Subject:
    a) Memcached and its variants
    b) Assignment of 3rd set of exercises: Problem-set03
  Slides: Lecture 10
  Links:
    Memcached
    Cuckoo hashing
Exam 09/01/2017 (10:00) Final written exam

