Distributed Operating Systems and Algorithm Analysis / Edition 1

Distributed Operating Systems and Algorithm Analysis / Edition 1

ISBN-10:
0201498383
ISBN-13:
9780201498387
Pub. Date:
03/18/1997
Publisher:
Addison-Wesley
ISBN-10:
0201498383
ISBN-13:
9780201498387
Pub. Date:
03/18/1997
Publisher:
Addison-Wesley
Distributed Operating Systems and Algorithm Analysis / Edition 1

Distributed Operating Systems and Algorithm Analysis / Edition 1

$159.4 Current price is , Original price is $159.4. You
$159.40 
  • SHIP THIS ITEM
    This item is available online through Marketplace sellers.
  • PICK UP IN STORE
    Check Availability at Nearby Stores
$54.15 
  • SHIP THIS ITEM

    Temporarily Out of Stock Online

    Please check back later for updated availability.

    • Condition: Good
    Note: Access code and/or supplemental material are not guaranteed to be included with used textbook.

This item is available online through Marketplace sellers.


Overview

Distributed Operating Systems and Algorithms integrates into one text both the theory and implementation aspects of distributed operating systems for the first time. This innovative book provides the reader with knowledge of the important algorithms necessary for an in-depth understanding of distributed systems; at the same time it motivates the study of these algorithms by presenting a systems framework for their practical application.

The first part of the book is intended for use in an advanced course on operating systems and concentrates on parallel systems, distributed systems, real-time systems, and computer networks. The second part of the text is written for a course on distributed algorithms with a focus on algorithms for asynchronous distributed systems. While each of the two parts is self-contained, extensive cross-referencing allows the reader to emphasize either theory or implementation or to cover both elements of selected topics.

Features:

  • Integrates and balances coverage of the advanced aspects of operating systems with the distributed algorithms used by these systems.
  • Includes extensive references to commercial and experimental systems to illustrate the concepts and implementation issues.
  • Provides precise algorithm description and explanation of why these algorithms were developed.
  • Structures the coverage of algorithms around the creation of a framework for implementing a replicated server-a prototype for implementing a fault-tolerant and highly available distributed system.
  • Contains programming projects on such topics as sockets, RPC, threads, and implementation of distributed algorithms using these tools.
  • Includes an extensive annotated bibliography for each chapter, pointing the reader to recent developments.
  • Solutions to selected exercises, templates to programming problems, a simulator for algorithms for distributed synchronization, and teaching tips for selected topics are available to qualified instructors from Addison Wesley.

0201498383B04062001


Product Details

ISBN-13: 9780201498387
Publisher: Addison-Wesley
Publication date: 03/18/1997
Edition description: New Edition
Pages: 569
Product dimensions: 7.48(w) x 8.97(h) x 1.33(d)

About the Author

About Randy Chow

Randy Chow is a professor of Computer and Information Science and Engineering at the University of Florida. His research interests include computer networks, distributed systems, computer security, and system performance evaluation.

Theodore Johnson is a member of the technical staff at the Database Research department of AT&T Labs-Research. Previously, he was a professor of Computer and Information Science and Engineering at the University of Florida. His research interests include distributed systems, databases, and performance modeling.

0201498383AB04062001

Read an Excerpt

PREFACE

This book has developed from a set of lecture notes we used to teach a graduate two-coursesequence on operating systems. We had difficulties finding a textbook that hadbalanced coverage of concepts and theories and sufficient detail for a graduate-levelclass. After using research papers and selected book chapters for several semesters (whichstudents hate), we developed lecture notes for the students and then developed the notesinto this book.

The key idea behind the book is to integrate the theory and practice of distributedoperating systems. Many texts provide an overview of distributed systems with a vaguedescription of the algorithms. It is difficult to teach an in-depth course from thesebooks. Other texts focus on algorithms. However, it is hard to motivate students to studyalgorithms if they do not have a picture of what is needed in practice. For example,understanding the need for naming and directory services clarifies the need for causalmessage passing and update propagation.

The first part of the book (Chapters 1 through 8) is intended to be used for a coregraduate course on operating systems. It provides a broad discussion of modern operatingsystems. Since we assume that the student readers already have taken an undergraduatecourse on operating systems, we do not include many topics, such as memorymanagement, deadlocks, and device drivers, that are typically covered by undergraduatecourses and that are well covered by existing texts. Instead, we concentrate on issuesand implementations of modern operating systems for parallel systems, distributed systems,real-time systems, and computer networks. The programming projects in this partconcentrate on using tools, such as sockets, RPC, and threads.

The second part of the book (Chapters 9 through 13) is intended for an advanced courseon distributed operating systems, focusing on algorithms for asynchronous distributedsystems. The material provides a framework for thinking about distributed systems, precisealgorithm descriptions, and an understanding of why the algorithms were developed andwhy they are correct. The theme behind much of the material presented in this sectionis providing the framework for implementing a replicated server— a prototype forimplementing a fault-tolerant and highly available distributed system. The projects inthis part use the tools presented in the first part to implement the algorithms.

The different focuses of the two parts of the book complement each other, with thefirst part supplying background and the second part supplying theory. While the book isintended for a two-course sequence, it also works well for a graduate operating systemor an advanced distributed systems class alone. Although each of the two parts is self-contained,extensive cross-referencing allows the reader to emphasize either theory orimplementation or to cover both elements of selected topics. With the widespread exposureof the Internet, we envision that the book could also be of interest to undergraduatestudents and can be tailored to undergraduate classes. Furthermore, the combined bookmakes an excellent reference for the practitioner who has some working experience onoperating systems.

BOOK ORGANIZATION

The book is structured to present the important concepts of distributed operating systems.We make many references to commercial and experimental systems to illustrate theconcepts and the implementation issues. In addition, each chapter provides an extensiveannotated bibliography that points the interested reader to recent developments indistributed computing theory, technology, and practice. In the following we give a briefdescription of the content of each chapter.

Part I of the textbook is a pragmatic presentation of the basic concepts and implementationissues of a distributed operating system. It is organized in three major components:fundamental concepts, distributed processes, and distributed resources.

Fundamental Concepts (Transparency, Service, and Coordination)

  • Chapter 1 introduces a classification of centralized operating system, network operatingsystem, distributed operating system, and cooperative autonomous systems, usingthe key characteristics of virtuality, interoperability, transparency and autonomicity,respectively, for each system. It illustrates the evolution that led to the development ofmodern distributed operating systems and explains the emerging need for distributedsoftware and the importance of distributed coordination algorithms.

  • Chapter 2 begins the discussion of distributed operating systems. It presents theconcepts of transparency and services. Distributed systems and their underlyingcommunication architectures are introduced. The chapter concludes with a list ofmajor system design issues that establishes an order for the presentation of thesubsequent chapters.

Distributed Processes (Synchronization, Communication, andScheduling)
  • Chapter 3 describes concurrent processes and programming. It defines processes andthreads and shows how their interaction can be modeled by using some fundamentalconcepts such as a graph, a logical clock, and the client and server model. Both sharedmemory and message passing for synchronization and communication are addressed.They are presented along with the development of concurrent language constructs.A taxonomy of these language mechanisms and their implementation is given. Thischapter presents an integrated view of synchronization and communication.

  • Chapter 4 extends the discussion of process interaction from synchronization tocommunication and to distributed process coordination using message passing communication.Three communication models, message passing (socket), request/reply(RPC), and transaction communication, are presented. A special emphasis is placedon group communication and coordination. Two classical distributed coordinationproblems, mutual exclusion and leader election using message passing interprocesscommunication, are introduced. These problems are further studied in Chapters 10 and11 in Part II of the textbook. The chapter also includes a presentation of name service,an essential facility for communication in distributed systems.

  • Chapter 5 turns to the third process management issue, that of process scheduling. Theeffect of communication on both static and dynamic process scheduling is emphasized.The chapter describes distributed computation through dynamic redistribution ofprocesses by using remote execution and process migration techniques. It also addressesseveral unique issues in real-time scheduling.

Distributed Resources (Files and Memory)
  • Chapter 6 discusses the distributed implementation of file systems, the first of thetwo important distributed resources: files and memory. It demonstrates the use of theconcept of transparency and service in the design of distributed file systems. Twomajor implementation issues, data caching and file replication, are discussed in thischapter. The chapter also covers distributed transactions as part of the file service.Since management of replicated data touches upon both data and communication,two central issues in distributed systems, it is further detailed in Chapter 12.

  • Chapter 7 covers distributed shared memory systems that simulate a logical sharedmemory on a physically distributed memory system. The issues studied are coherenceand consistency of data due to memory sharing. The chapter describes implementationstrategies for different memory consistency requirements. It also shows the significanceof the object-based data sharing models.

  • Chapter 8 addresses unique security issues in network and distributed environments.These issues are divided into two areas: authorization and authentication. Authorizationincludes the study of distributed access and flow control models. Authenticationcovers cryptography and its applications for mutual authentication and key distributionprotocols. Implementations of some security features in modern systems areillustrated.

    Part II of the textbook discusses distributed algorithms. The discussion is pragmaticand is intended to give the reader a solid understanding of common problems andsolution techniques. The topics are organized in five chapters.

Distributed Algorithms
  • Chapter 9: introduces the concepts of time and global states in a distributed system.The fundamental problem of distributed algorithms is a lack of a global clock and aglobal state. Recent research on vector time and distributed predicates has developedunified models for thinking about distributed time and the distributed state. Thischapter presents the concepts of causality, vector timestamps, and global states. Thealgorithms for implementing these concepts are presented. The connections betweenthe different models are explored. Finally, a model for proving the correctness ofdistributed algorithms is presented.

  • Chapter 10: covers distributed synchronization and distributed election. While thedistributed synchronization algorithms are not considered pragmatic, they illustrateimportant algorithm design techniques. For example, voting algorithms for replicateddata management are foreshadowed in Maekawa's algorithm, and the Chang-Singhal-Liualgorithm illustrates the ideas behind distributed shared memory (and distributedobject) algorithms. The chapter concludes with algorithms for electing a computationleader. Election is a critical component of many systems. The invitation algorithmviiin particular is a prototype for handling failures in an asynchronous system andforeshadows the group view maintenance algorithms of Chapter 12.

  • Chapter 11 discusses the abstract distributed agreement problem. First, Byzantineagreement is discussed. Next, the Fischer-Lynch-Paterson (FLP) result that no algorithmsolves distributed agreement problems in an asynchronous system is coveredin detail. This is the appropriate point to introduce the FLP result, because the nextchapter covers replicated data management and must solve distributed agreement inasynchronous systems. The FLP result leaves open three ways to achieve distributedagreement in an asynchronous system: hope that it happens, use relative agreement,or use a randomized algorithm. The chapter discusses these implications of the FLPresult and concludes with some randomized agreement protocols.

  • Chapter 12 covers replicated data management. Since providing replicated servers reducesto replicating the state of the servers, this section also discusses the problems andconcepts of replication. We cover three main approaches: the transaction approach,the reliable multicast approach, and the log propagation approach. The transactionapproach includes discussion of two-phase commit, three-phase commit, one-copyserializability, voting, and dynamic voting protocols. The reliable multicast approachincludes discussion of virtual synchrony, algorithms for implementing reliable andcausal multicast, algorithms for totally ordered multicast, and consistent multicastgroup maintenance algorithms. The log propagation approach covers naive log propagation,epidemics, and causal log propagation. This chapter is the culmination of PartII of the text and draws together the results presented in previous chapters.

  • Chapter 13 covers distributed rollback and recovery. These techniques are critical forimplementing fault-tolerant systems and are complimentary to the replicated datamanagement techniques of the previous chapter. By using the theory developed in theprevious chapters (especially Chapter 9), different rollback and recovery algorithmsare presented in a unified manner and are related to algorithms discussed previously.

SUGGESTED COVERAGE

The book contains sufficient materials for a two-semester operating system (or distributedsystem) course sequence. To use the book for an one-semester course, we recommend thefollowing two options for coverage with different orientations.

  • Distributed Operating Systems: This option covers the entire Part I with supplementalSections 9.1, 9.2, 10.1.2, 10.1.3, 10.1.4, 10.2, 12.1, and 12.1.3 from Part II. The coursewill focus on the implementation issues of the major components in a distributedoperating system. The supplemental sections in Part II extend the discussion ofdistributed coordination algorithms.

  • Distributed Algorithms: This option uses the entire Part II with Sections 4.1-4.4,6.1-6.3, and 7.1-7.4 from Part I. The course will emphasize the design of distributedalgorithms in distributed systems. The selected sections in Part I serve as thebackground and motivations for the discussion of distributed algorithms in Part II.

ACKNOWLEDGEMENTS

The authors wish to acknowledge the COP5615 (Operating System Principles) and theCOP6635 (Distributed Operating Systems) classes for allowing them to test preliminaryversions of the book. Special thanks go to Anthony Bridgewater, Raja Chatterjee, WilliamChow, Roger Collins, Darin Davis, Carlos Guerra, Steve Greenwald, Vanja Josifovski, EricShaffer, and Shiby Thomas for finding errors in the examples and algorithms.

Several topics of discussion use the research work from theses and dissertations bystudents at the University of Florida. Frank Anger and J. J. Hwang contributed themultiprocessor scheduling models and algorithms. The LAM distributed shared memorysystem was given by Roger Denton. I-Lung Kao shared the discussion of the complexsecurity policies. Lionel Maugis contributed the start of the bibliography and the totalordering algorithm from his thesis.

The comments and feedback from the reviewers were enormously valuable. We wish toshow our appreciation to them for providing their expertise to improve the quality of thebook. They are: Warren R. Carithers of Rochester Institute of Technology, Prasun Dewanof University of North Carolina, Peter G. Drexel of Plymouth State College, Gary Harkin ofMontana State University, Eric H. Herrin, II of University of Kentucky, Mark A. Hollidayof Western Carolina University, Michael A. Keenan of Columbus State University, WilliamF. Klostermeyer of West Virginia University, Bruce Maggs of Carnegie-Mellon University,James Purtilo of University of Maryland, Gurdip Singh of Kansa State University, andSalih Yurttas of Texas A&M University.

The book cover design was inspired by the "turtles all the way down" anecdote whichwas believed to originate from WilliamJames and retold by many philosophers includingBertrand Russell and Steve Hawking. We added the stack of "gators" to show anotherlayered-structured autonomous system. The art work was contributed by Sheng-Wan Hu.Special thanks go to Michael Downes and S. Y. Cheng who helped to solve many Latextechnical problems.

We started writing the book when our wives complained that they did not have anybook dedicated to them as many others do. We would thank Johna and Taiying for givingus such great motivation to write the book. This book is dedicated to them.

Table of Contents

(All chapters, except Chapters 1 and 2, conclude with a Summary and Bibliography, and Exercises.)

Distributed Operating Systems.

Operating System Fundamentals.

Evolution of Modern Operating Systems.

Centralized Operating System Overview.

Network Operating Systems.

Distributed Operating Systems.

Cooperative Autonomous Systems.

Distributed Algorithms.

Systems: Concepts and Architecture’s.

Goals.

Transparency.

Services.

Architecture Models.

Network Communication Protocols Major Design Issues.

Distributed Computing Environment (DCE).

Concurrent Processes and Programming.

Processes and Threads.

Graph Models for Process Representation.

The Client/Server Model Time Services.

Language Mechanisms for Synchronization.

Object Model Resource Servers Concurrent Programming Languages.

Distributed and Network Programming Languages.

Interprocess Communication and Coordination.

Message Passing Communication.

Request/Reply Communication.

Transaction Communication.

Name and Directory Services.

Distributed Mutual Exclusion.

Leader Election.

Distributed Process Scheduling.

A System Performance Model.

Static Process Scheduling with Communication.

Dynamic Load Sharing and Balancing.

Distributed process Implementation.

Real-time Scheduling.

Distributed File Systems.

Transparencies and Characteristics of DFS.

DFS Design and Implementation.

Transaction Service and Concurrency Control.

Data and File Replication.

Distributed Shared Memory.

Non-Uniform Memory Access Architecture’s.

Memory Consistency Models.

Multiprocessor Cache Systems.

Distributed Shared Memory.

Implementation of DSM systems.

Distributed Computer Security.

Fundamentals of Computer Security.

Discretionary Access Control Models.

Mandatory Flow Control Models.

Cryptography.

Distributed Authentication and Key Distribution Issues Relevant to Distributed Security.

Distributed Algorithm.

Models of Distributed Computation.

Preliminaries.

Causality.

Distributed Snapshots.

Modeling a Distributed Computation Failures in a Distributed System.

Synchronization and Election.

Distributed Mutual Exclusion.

Election.

Distributed Agreement.

Adversaries.

Byzantine Agreement.

Impossibility of Consensus.

Randomized Distributed Agreement.

Replicated Data Management.

Database Techniques.

Atomic Multicast.

Update Propagation.

Checkpointing and Recovery.

Problems in Rollback.

Incarnation Numbers.

Taxonomy of Solution Techniques Uncoordinated Checkpointing.

Coordinated Checkpointing.

Synchronous Logging Asynchronous Logging.

Adaptive Logging. 0201498383T04062001

Preface

PREFACE

This book has developed from a set of lecture notes we used to teach a graduate two-coursesequence on operating systems. We had difficulties finding a textbook that hadbalanced coverage of concepts and theories and sufficient detail for a graduate-levelclass. After using research papers and selected book chapters for several semesters (whichstudents hate), we developed lecture notes for the students and then developed the notesinto this book.

The key idea behind the book is to integrate the theory and practice of distributedoperating systems. Many texts provide an overview of distributed systems with a vaguedescription of the algorithms. It is difficult to teach an in-depth course from thesebooks. Other texts focus on algorithms. However, it is hard to motivate students to studyalgorithms if they do not have a picture of what is needed in practice. For example,understanding the need for naming and directory services clarifies the need for causalmessage passing and update propagation.

The first part of the book (Chapters 1 through 8) is intended to be used for a coregraduate course on operating systems. It provides a broad discussion of modern operatingsystems. Since we assume that the student readers already have taken an undergraduatecourse on operating systems, we do not include many topics, such as memorymanagement, deadlocks, and device drivers, that are typically covered by undergraduatecourses and that are well covered by existing texts. Instead, we concentrate on issuesand implementations of modern operating systems for parallel systems, distributed systems,real-time systems, and computer networks. The programming projects in this partconcentrate on using tools, such as sockets, RPC, and threads.

The second part of the book (Chapters 9 through 13) is intended for an advanced courseon distributed operating systems, focusing on algorithms for asynchronous distributedsystems. The material provides a framework for thinking about distributed systems, precisealgorithm descriptions, and an understanding of why the algorithms were developed andwhy they are correct. The theme behind much of the material presented in this sectionis providing the framework for implementing a replicated server-- a prototype forimplementing a fault-tolerant and highly available distributed system. The projects inthis part use the tools presented in the first part to implement the algorithms.

The different focuses of the two parts of the book complement each other, with thefirst part supplying background and the second part supplying theory. While the book isintended for a two-course sequence, it also works well for a graduate operating systemor an advanced distributed systems class alone. Although each of the two parts is self-contained,extensive cross-referencing allows the reader to emphasize either theory orimplementation or to cover both elements of selected topics. With the widespread exposureof the Internet, we envision that the book could also be of interest to undergraduatestudents and can be tailored to undergraduate classes. Furthermore, the combined bookmakes an excellent reference for the practitioner who has some working experience onoperating systems.

BOOK ORGANIZATION

The book is structured to present the important concepts of distributed operating systems.We make many references to commercial and experimental systems to illustrate theconcepts and the implementation issues. In addition, each chapter provides an extensiveannotated bibliography that points the interested reader to recent developments indistributed computing theory, technology, and practice. In the following we give a briefdescription of the content of each chapter.

Part I of the textbook is a pragmatic presentation of the basic concepts and implementationissues of a distributed operating system. It is organized in three major components:fundamental concepts, distributed processes, and distributed resources.

Fundamental Concepts (Transparency, Service, and Coordination)

  • Chapter 1 introduces a classification of centralized operating system, network operatingsystem, distributed operating system, and cooperative autonomous systems, usingthe key characteristics of virtuality, interoperability, transparency and autonomicity,respectively, for each system. It illustrates the evolution that led to the development ofmodern distributed operating systems and explains the emerging need for distributedsoftware and the importance of distributed coordination algorithms.
  • Chapter 2 begins the discussion of distributed operating systems. It presents theconcepts of transparency and services. Distributed systems and their underlyingcommunication architectures are introduced. The chapter concludes with a list ofmajor system design issues that establishes an order for the presentation of thesubsequent chapters.

Distributed Processes (Synchronization, Communication, andScheduling)

  • Chapter 3 describes concurrent processes and programming. It defines processes andthreads and shows how their interaction can be modeled by using some fundamentalconcepts such as a graph, a logical clock, and the client and server model. Both sharedmemory and message passing for synchronization and communication are addressed.They are presented along with the development of concurrent language constructs.A taxonomy of these language mechanisms and their implementation is given. Thischapter presents an integrated view of synchronization and communication.
  • Chapter 4 extends the discussion of process interaction from synchronization tocommunication and to distributed process coordination using message passing communication.Three communication models, message passing (socket), request/reply(RPC), and transaction communication, are presented. A special emphasis is placedon group communication and coordination. Two classical distributed coordinationproblems, mutual exclusion and leader election using message passing interprocesscommunication, are introduced. These problems are further studied in Chapters 10 and11 in Part II of the textbook. The chapter also includes a presentation of name service,an essential facility for communication in distributed systems.
  • Chapter 5 turns to the third process management issue, that of process scheduling. Theeffect of communication on both static and dynamic process scheduling is emphasized.The chapter describes distributed computation through dynamic redistribution ofprocesses by using remote execution and process migration techniques. It also addressesseveral unique issues in real-time scheduling.

Distributed Resources (Files and Memory)

  • Chapter 6 discusses the distributed implementation of file systems, the first of thetwo important distributed resources: files and memory. It demonstrates the use of theconcept of transparency and service in the design of distributed file systems. Twomajor implementation issues, data caching and file replication, are discussed in thischapter. The chapter also covers distributed transactions as part of the file service.Since management of replicated data touches upon both data and communication,two central issues in distributed systems, it is further detailed in Chapter 12.
  • Chapter 7 covers distributed shared memory systems that simulate a logical sharedmemory on a physically distributed memory system. The issues studied are coherenceand consistency of data due to memory sharing. The chapter describes implementationstrategies for different memory consistency requirements. It also shows the significanceof the object-based data sharing models.
  • Chapter 8 addresses unique security issues in network and distributed environments.These issues are divided into two areas: authorization and authentication. Authorizationincludes the study of distributed access and flow control models. Authenticationcovers cryptography and its applications for mutual authentication and key distributionprotocols. Implementations of some security features in modern systems areillustrated.

    Part II of the textbook discusses distributed algorithms. The discussion is pragmaticand is intended to give the reader a solid understanding of common problems andsolution techniques. The topics are organized in five chapters.

Distributed Algorithms

  • Chapter 9: introduces the concepts of time and global states in a distributed system.The fundamental problem of distributed algorithms is a lack of a global clock and aglobal state. Recent research on vector time and distributed predicates has developedunified models for thinking about distributed time and the distributed state. Thischapter presents the concepts of causality, vector timestamps, and global states. Thealgorithms for implementing these concepts are presented. The connections betweenthe different models are explored. Finally, a model for proving the correctness ofdistributed algorithms is presented.
  • Chapter 10: covers distributed synchronization and distributed election. While thedistributed synchronization algorithms are not considered pragmatic, they illustrateimportant algorithm design techniques. For example, voting algorithms for replicateddata management are foreshadowed in Maekawa's algorithm, and the Chang-Singhal-Liualgorithm illustrates the ideas behind distributed shared memory (and distributedobject) algorithms. The chapter concludes with algorithms for electing a computationleader. Election is a critical component of many systems. The invitation algorithmviiin particular is a prototype for handling failures in an asynchronous system andforeshadows the group view maintenance algorithms of Chapter 12.
  • Chapter 11 discusses the abstract distributed agreement problem. First, Byzantineagreement is discussed. Next, the Fischer-Lynch-Paterson (FLP) result that no algorithmsolves distributed agreement problems in an asynchronous system is coveredin detail. This is the appropriate point to introduce the FLP result, because the nextchapter covers replicated data management and must solve distributed agreement inasynchronous systems. The FLP result leaves open three ways to achieve distributedagreement in an asynchronous system: hope that it happens, use relative agreement,or use a randomized algorithm. The chapter discusses these implications of the FLPresult and concludes with some randomized agreement protocols.
  • Chapter 12 covers replicated data management. Since providing replicated servers reducesto replicating the state of the servers, this section also discusses the problems andconcepts of replication. We cover three main approaches: the transaction approach,the reliable multicast approach, and the log propagation approach. The transactionapproach includes discussion of two-phase commit, three-phase commit, one-copyserializability, voting, and dynamic voting protocols. The reliable multicast approachincludes discussion of virtual synchrony, algorithms for implementing reliable andcausal multicast, algorithms for totally ordered multicast, and consistent multicastgroup maintenance algorithms. The log propagation approach covers naive log propagation,epidemics, and causal log propagation. This chapter is the culmination of PartII of the text and draws together the results presented in previous chapters.
  • Chapter 13 covers distributed rollback and recovery. These techniques are critical forimplementing fault-tolerant systems and are complimentary to the replicated datamanagement techniques of the previous chapter. By using the theory developed in theprevious chapters (especially Chapter 9), different rollback and recovery algorithmsare presented in a unified manner and are related to algorithms discussed previously.

SUGGESTED COVERAGE

The book contains sufficient materials for a two-semester operating system (or distributedsystem) course sequence. To use the book for an one-semester course, we recommend thefollowing two options for coverage with different orientations.

  • Distributed Operating Systems: This option covers the entire Part I with supplementalSections 9.1, 9.2, 10.1.2, 10.1.3, 10.1.4, 10.2, 12.1, and 12.1.3 from Part II. The coursewill focus on the implementation issues of the major components in a distributedoperating system. The supplemental sections in Part II extend the discussion ofdistributed coordination algorithms.
  • Distributed Algorithms: This option uses the entire Part II with Sections 4.1-4.4,6.1-6.3, and 7.1-7.4 from Part I. The course will emphasize the design of distributedalgorithms in distributed systems. The selected sections in Part I serve as thebackground and motivations for the discussion of distributed algorithms in Part II.

ACKNOWLEDGEMENTS

The authors wish to acknowledge the COP5615 (Operating System Principles) and theCOP6635 (Distributed Operating Systems) classes for allowing them to test preliminaryversions of the book. Special thanks go to Anthony Bridgewater, Raja Chatterjee, WilliamChow, Roger Collins, Darin Davis, Carlos Guerra, Steve Greenwald, Vanja Josifovski, EricShaffer, and Shiby Thomas for finding errors in the examples and algorithms.

Several topics of discussion use the research work from theses and dissertations bystudents at the University of Florida. Frank Anger and J. J. Hwang contributed themultiprocessor scheduling models and algorithms. The LAM distributed shared memorysystem was given by Roger Denton. I-Lung Kao shared the discussion of the complexsecurity policies. Lionel Maugis contributed the start of the bibliography and the totalordering algorithm from his thesis.

The comments and feedback from the reviewers were enormously valuable. We wish toshow our appreciation to them for providing their expertise to improve the quality of thebook. They are: Warren R. Carithers of Rochester Institute of Technology, Prasun Dewanof University of North Carolina, Peter G. Drexel of Plymouth State College, Gary Harkin ofMontana State University, Eric H. Herrin, II of University of Kentucky, Mark A. Hollidayof Western Carolina University, Michael A. Keenan of Columbus State University, WilliamF. Klostermeyer of West Virginia University, Bruce Maggs of Carnegie-Mellon University,James Purtilo of University of Maryland, Gurdip Singh of Kansa State University, andSalih Yurttas of Texas A&M University.

The book cover design was inspired by the "turtles all the way down" anecdote whichwas believed to originate from WilliamJames and retold by many philosophers includingBertrand Russell and Steve Hawking. We added the stack of "gators" to show anotherlayered-structured autonomous system. The art work was contributed by Sheng-Wan Hu.Special thanks go to Michael Downes and S. Y. Cheng who helped to solve many Latextechnical problems.

We started writing the book when our wives complained that they did not have anybook dedicated to them as many others do. We would thank Johna and Taiying for givingus such great motivation to write the book. This book is dedicated to them.

0201498383P04062001

From the B&N Reads Blog

Customer Reviews