[08.10.2012]

Lecture 184.237 (2.0 hrs) winter term 2012/2013

Distributed Systems (English)

This is the homepage of the Distributed Systems lecture in English. If you are looking for the German lecture, please go there.

Please note the information in TISS, in particular the mandatory registration for the group "English lecture". This course is primarily intended for PhD and international (e.g., Erasmus) students; others are welcome subject to (rather limited) availability.

There is an accompanying lab, see Distributed Systems Lab (Verteilte Systeme LU).

[Schedule] [Lecture notes] [Grading] [Preparation and Exercises] [Contact]

Schedule

The lecture takes place in October and November. The following table contains the precise schedule and contents:

Date	Time	Place	Content
Wed, 10 Oct	14:45-16:15	EI1	Lecture overview, foundations of distributed systems
Mon, 15 Oct	16:15-19:45	EI1	Communication (1) and Communication (2)
Wed, 24 Oct	14:45-18:00	EI1	Operating System Support and Naming and Discovery
Mon, 29 Oct	16:15-19:45	EI1	Clocks and Agreement and Consistency and Replication
Wed, 31 Oct	14:45-16:30	EI1	Security
Mon, 05 Nov	16:15-19:45	EI1	Dependability and Fault Tolerance, Technology Overview and lecture summary

Lecture notes

The chapter, section, and page references below refer to the 2nd edition of Distributed Systems by A. Tanenbaum and M. van Steen (2006), which can be purchased, e.g., from Amazon.com. Please note the corrections as well - in particular pp. 332 and 333. The European International Edition (2008) is identical and can be purchased, e.g., from Amazon.de.

Lecture slides are available. Please note that the slides alone are not sufficient; you will typically need the book as well.

For deeper understanding, refer to the original papers. Other good distributed systems books are:

G. Coulouris, J. Dollimore, T. Kindberg, Distributed Systems: Concepts and Design, 4th ed., Addison-Wesley, 2005, ISBN: 0321-263-545
K. P. Birman, Reliable Distributed Systems, Springer, 2005, ISBN: 0-387-21509-3
G. Alonso et al., Web Services, Springer, 2003, ISBN: 978-3540440086

Grading

There is no written or oral exam. Instead, for each lecture you have to prepare several of the questions below. Before the lecture starts, you are asked to mark the questions you are able to present. Some of the students are then chosen to present their answers. Afterwards, the lecturer explains further and answers questions.

Please note that student presentations start in the very first lecture, so some preparation is needed right from the beginning.

Each presentation receives a score of 0-100% according to its quality. To obtain a given grade, the average score of your best two presentations has to reach the threshold for that grade, and the number of questions you marked has to reach the corresponding minimum. The following table shows the resulting total grade:

Average score of best two presentations	Minimum number of questions marked	Grade
91% or better	45	Sehr Gut (1)
80% or better	40	Gut (2)
65% or better	30	Befriedigend (3)
50% or better	25	Genügend (4)
other	other	not passed
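
The grading rule can be sketched in a few lines of Python. This is only an illustration of the table, not an official calculator: the names `GRADE_TABLE` and `total_grade` are made up for this example, and the handling of fewer than two presentations is an assumption, since the rule above does not cover that case.

```python
# Rows of the grading table:
# (minimum average in %, minimum number of questions marked, grade)
GRADE_TABLE = [
    (91, 45, "Sehr Gut (1)"),
    (80, 40, "Gut (2)"),
    (65, 30, "Befriedigend (3)"),
    (50, 25, "Genügend (4)"),
]


def total_grade(presentation_scores, questions_marked):
    """Return the grade for a list of presentation scores (0-100)."""
    best_two = sorted(presentation_scores, reverse=True)[:2]
    # Assumption: with fewer than two presentations there is no
    # best-two average, so the course counts as not passed.
    if len(best_two) < 2:
        return "not passed"
    average = sum(best_two) / 2
    # Both conditions of a row must hold; take the best row that fits.
    for min_average, min_marked, grade in GRADE_TABLE:
        if average >= min_average and questions_marked >= min_marked:
            return grade
    return "not passed"
```

For example, an average of 92.5% with only 32 marked questions yields Befriedigend (3) under this reading, because the number of marked questions limits the grade as well.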

Preparation and Exercises

Foundations of distributed systems

Book: Chp. 1 and Chp. 2 up to and including 2.2

Questions:

Provide a definition for "distributed system". Name important design goals and characteristic properties of distributed systems. What are the fallacies and pitfalls?
Describe the role of transparency in distributed systems, the critical aspects, and the different kinds of transparencies.
What is "Openness"?
Explain challenges and solution approaches for "Scalability".
What is "vertical distribution" and what is an N-tier system? Discuss the variations of client/server systems. Is a Java applet therefore a thick or a thin client?
What is horizontal distribution? Explain the design-space of horizontal distribution. Is there a relation to vertical distribution?

Communication (1) - Middleware and RPC

Book: 2.1 (repeated), 2.3, 2.4, 4.1, 4.2

Questions:

Explain the ISO-OSI model of layered protocols (in principle) and relate it to TCP/IP. Why are transport layer protocols typically not sufficient for distributed systems?
What is "Middleware"? What are the requirements and services of middleware? Explain the relation between middleware and architectural styles.
How can we enhance the flexibility of middleware and improve the middleware/application interaction? Explain the concepts of interceptors, adaptivity, and self-management.
Explain the principles of the Remote Procedure Call (RPC). What are "stubs"?
How can parameters be passed in a procedure call in principle? How can they be passed in RPC and what are the limitations? What is "parameter marshalling"?
How can one write a client and a server for RPC, and what is the meaning of IDL in this context? Why do we need an IDL? What is "binding" in this context?
What kinds of asynchronous RPCs exist? Explain different types of communication in general (persistent/transient and synchronous/asynchronous).

Communication (2) - RMI and Messaging

Book: 4.3, 10.1.1, 10.3, 10.4

Questions:

Explain the principles of distributed objects and the Remote Object (or Method) Invocation (RMI). What are the differences between "Compile-time" and "Run-time" objects? Explain the difference between persistent and transient objects.
How does binding work with RMI? Relate this to different types of object references. Compare CORBA and Java with respect to their way of implementing object references.
What is "static" and "dynamic" invocation of remote methods? Provide an example.
How can parameters be passed in RMI? What properties of object orientation make the difference here and what is the advantage over RPC?
Explain the categories of message-oriented communication and show CORBA messaging as an example (two variants).
What is "Message-oriented Middleware" (MoM)? Explain the model and architecture of "Message-Queueing" systems. Explain the primitives Put, Get, Poll, and Notify. Discuss possible applications, advantages, and disadvantages. Explain the idea of a message broker and its importance for EAI.

Operating System Support

Book: 10.2.1, Chp. 3
You can dig into the topic with a paper on the development of User Interface Tools.
In Chp. 11 of the CORBA specification (POA - Portable Object Adapter) you can find a particular implementation of the object adapter principle.

Questions:

Explain the difference between process and thread. What does the developer have to take care of when using multithreading? Explain the importance of threads in distributed systems, in particular client/server systems.
Which aspects of distributed systems are important when designing the client? How are user interfaces integrated into the architecture of a distributed system? How can different types of transparencies be supported?
Explain some basic design considerations for servers. Describe the difference between stateful and stateless. Depict architecture and function of a multi-threaded server (e.g., a file or web server).
What are the particularities of object servers? How can the invocation of an object be realised on the server side (e.g., policies regarding thread, code sharing, and object creation/activation)? What is an object adapter?
Explain the most important aspects of code migration, including strong and weak mobility. Provide an example for weak mobility.
Describe the concept of virtualization and its significance for code migration in distributed systems. What different kinds of virtual machines do you know?

Naming and Discovery

Book: Chp. 5

Questions:

Explain the terms "Name", "Identifier", and "Address" and relate them.
What is a "Name Space"? Explain the basic principle of the "Closure Mechanism" and provide an example (e.g., Unix file system or DNS).
Describe the different layers of distributed name spaces and how they may benefit from replication and caching. Explain the difference between iterative and recursive name resolution.
Explain the Domain Name System (DNS) in principle, and the process of name resolution based on the structure of the DNS database (resource records). What is reverse lookup? What is a zone transfer?
What is a directory service, and what is "attribute-based naming"? Describe the basic structure of the X.500 name space and its LDAP implementation.
How does a location service work in a flat namespace? Describe different solution approaches with respect to mobility and discovery. Explain the pros and cons of "Forwarding Pointers". How do "Home-based approaches" work for mobile devices?

Clocks and Agreement

Book: Chp. 6
Extract (c) Pearson/Prentice Hall (for the question regarding the "Global state")

Questions:

Why do we need clock synchronization? Explain NTP and the Berkeley algorithm. What is the challenge when synchronizing physical clocks?
Why do we use logical clocks and how do they differ from physical clocks? What is the "happened-before" relation and how do "Lamport-Timestamps" work?
What are the disadvantages of Lamport timestamps and how can they be improved with vector timestamps?
How does distributed mutual exclusion work? Compare different algorithms (centralized, distributed, token-ring) with respect to scalability and fault tolerance.
Explain the "Bully" and the "Ring" algorithms for election and compare them with respect to fault tolerance and safety. Why are these algorithms less useful for large-scale systems and which solution approaches are applicable there?
What are the problems for recording the global state of a distributed system and how can they be solved? Explain the algorithm.
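
When preparing the question on Lamport timestamps, it can help to write the update rules down as code. The following is a minimal sketch and not material from the lecture; the class name `LamportClock` and the method names `tick` and `receive` are assumptions chosen for this example.

```python
class LamportClock:
    """Logical clock of one process, following Lamport's update rules."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event or message send: increment, then use the value."""
        self.time += 1
        return self.time

    def receive(self, message_timestamp):
        """Message receipt: move past the sender's timestamp.

        Taking the maximum first guarantees that the receive event
        gets a larger timestamp than the corresponding send event.
        """
        self.time = max(self.time, message_timestamp) + 1
        return self.time


# Process p sends a message to process q, which is already ahead.
p, q = LamportClock(), LamportClock()
for _ in range(3):
    q.tick()                   # three local events at q, clock is now 3
sent = p.tick()                # p stamps the message with 1
received = q.receive(sent)     # q moves to max(3, 1) + 1 = 4
```

The sketch also hints at the limitation asked about in the next question: q's third local event carries timestamp 3 and p's send carries timestamp 1, yet the two events are concurrent, so comparing timestamps alone does not reveal causality.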

Consistency and Replication

Book: 7.1, 7.4, 7.5, 10.6, 4.5.2

Questions:

What are the main reasons for using replication in distributed systems? How are replication and scalability related? Explain different ways of content replication and content placement.
Compare different ways of update propagation (content distribution).
How do "primary-based" protocols work? Compare different alternatives.
Explain "replicated-write" protocols and compare different alternatives. What are the challenges of "active replication"? How do "quorum-based" replication protocols work and how can they avoid read-write and write-write conflicts?
What are the particularities with object replication? Depict a possible realization of replication transparency in object systems ("replicated invocation").
What are epidemic protocols? What are the advantages and disadvantages? Explain "gossiping" ("rumor spreading") as one model for replica update propagation. What is the anti-entropy model in this context?
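
For the question on quorum-based protocols, the two usual constraints on the read and write quorum can be checked with a few lines of code. This is a minimal sketch under the standard voting scheme with N replicas; the function and parameter names are hypothetical, and the notation in the book may differ.

```python
def quorum_is_valid(replicas, read_quorum, write_quorum):
    """Check the two quorum constraints for `replicas` copies of a data item."""
    # Every read quorum overlaps every write quorum, so a read always
    # sees at least one replica with the latest version.
    no_read_write_conflict = read_quorum + write_quorum > replicas
    # Any two write quorums overlap, so two writes cannot both succeed
    # on disjoint sets of replicas.
    no_write_write_conflict = 2 * write_quorum > replicas
    return no_read_write_conflict and no_write_write_conflict
```

With 12 replicas, a read quorum of 3 and a write quorum of 10 satisfy both constraints, whereas a read quorum of 7 and a write quorum of 6 allow two writes on disjoint halves. A read quorum of 1 with a write quorum of 12 corresponds to read-one, write-all.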

Dependability and Fault Tolerance

Book: 8.1 - 8.4
Extract (c) Pearson/Prentice Hall (for the question regarding the "Two army problem")
Regarding the question of the byzantine generals, please refer to the slides instead of the book.

Questions:

Explain the basic terms of Dependability: Name the five attributes (requirements) of a "dependable system". What is the difference between availability and reliability? Explain the "dependability threats" Failure, Error, and Fault and the relation between them. Explain "permanent", "transient", and "intermittent" faults and provide examples.
Why do we need a failure model? Provide different failure models for a "fail-controlled system" and discuss them with respect to the required effort for masking faults. Why is it awkward to specify a system as "k-fault-tolerant"?
Why do we need redundancy for masking faults? What kinds of redundancy do you know?
Explain the proposition of the "two-army" problem.
Explain the proposition of the "byzantine generals".
Explain the failure classes in client/server systems. What is the "lost reply" problem?
What is reliable and ordered multicast (group communication) in static process groups? What has to be taken care of when the groups change dynamically? Explain the concept of "atomic multicast" ("virtual synchrony").

Security

Book: Chp. 9, 12.8

Questions:

Define "security" with its attributes "availability", "confidentiality", and "integrity" and give examples in your explanation. Describe the four security threats and mechanisms to ensure security.
Give a (valid) definition for "cryptography" and explain the core principle of symmetric and asymmetric encryption mechanisms. Further explain the specific advantages and disadvantages of the different algorithms.
Which kind of protection can be provided by a secure channel? Give examples of insecure communication where different requirements to a secure channel are not satisfied. Explain how two communication partners can perform mutual authentication based on public key cryptography.
Explain how two communication partners can perform mutual authentication based on symmetric key cryptography. Show an example of what can happen if an authentication protocol is "optimized" incorrectly.
What is meant by the term "digital signature" and which guarantees are given by such a signature? What is a hash function and which properties does it have to provide in order to be usable in the construction of a digital signature? Why is the use of a hash function an advantage?
What is a Key Distribution Centre (KDC), what is it used for, and which advantage does it provide over not using a KDC? Show based on a protocol how a KDC can be used to perform mutual authentication of communication partners. Explain for each step, which partner has already authenticated to which other partner(s).
One of the core issues of secure communication is the distribution of the initial keys. Describe ways to securely establish and distribute keys, and state which guarantees must be provided by the channel over which a key is distributed. Additionally, explain the concept of a public key certificate and a certificate revocation list.

Technology Overview

Book: 10.1.2, overview of Chp. 12 and Chp. 13

Questions:

In the taxonomy below, relate the four types of communication models to specific techniques, systems, or examples:

		Temporal
		Coupled	Uncoupled
Referential	Coupled	(a) Direct	(b) Mailbox
Referential	Uncoupled	(c) Meeting oriented	(d) Generative Communication
Explain the principle of publish/subscribe as communication and coordination mechanism. What are the benefits and drawbacks of this mechanism (and its implementation)? Explain JINI and JavaSpaces as an example.
What is a web server? What is the purpose of the Hypertext Transfer Protocol (HTTP)? Discuss the basic idea of CGI.

Contact

Contact and E-Mail: Dr. Karl M. Göschka
