The Grid: Core Technologies / Edition 1

ISBN-10: 0470094176
ISBN-13: 9780470094174
Pub. Date: 05/06/2005
Publisher: Wiley

Paperback

$123.95

Overview

Find out which technologies enable the Grid and how to employ them successfully!

This invaluable text provides a complete, clear, systematic, and practical understanding of the technologies that enable the Grid. The authors outline all the components necessary to create a Grid infrastructure that enables support for a range of wide-area distributed applications. The Grid: Core Technologies takes a pragmatic approach with numerous practical examples of software in context. It describes the middleware components of the Grid step-by-step, and gives hands-on advice on designing and building a Grid environment with the Globus Toolkit, as well as writing applications.

The Grid: Core Technologies:

  • Provides a solid and up-to-date introduction to the technologies that underpin the Grid.
  • Contains a systematic explanation of the Grid, including its infrastructure, basic services, job management, user interaction, and applications.
  • Explains in detail OGSA (Open Grid Services Architecture), Web Services technologies (SOAP, WSDL, UDDI), and Grid Monitoring.
  • Covers Web portal-based tools such as the Java CoG, GridPort, GridSphere, and JSR 168 Portlets.
  • Tackles hot topics such as WSRF (Web Services Resource Framework), the Semantic Grid, the Grid Security Infrastructure, and Workflow systems.
  • Offers practical examples to enhance the understanding and use of Grid components and the associated tools.

This rich resource will be essential reading for researchers and postgraduate students in computing and engineering departments, IT professionals in distributed computing, as well as Grid end users such as physicists, statisticians, biologists and chemists.


Product Details

ISBN-13: 9780470094174
Publisher: Wiley
Publication date: 05/06/2005
Pages: 456
Product dimensions: 6.63(w) x 9.74(h) x 1.02(d)

About the Author

Dr Maozhen Li is currently Lecturer in Electronics and Computer Engineering, in the School of Engineering and Design at Brunel University, UK. From January 1999 to January 2002, he was Research Associate in the Department of Computer Science, Cardiff University, UK. Dr Li received his PhD degree in 1997, from the Institute of Software, Chinese Academy of Sciences, Beijing, China. His research interests are in the areas of Grid computing, problem-solving environments for large-scale simulations, software agents for semantic information retrieval, multi-modal user interface design and computer support for cooperative work. Since 1997, Dr Li has published 30 research papers in prestigious international journals and conferences.

Dr Mark Baker is a hardworking Reader in Distributed Systems at the University of Portsmouth. He also currently holds visiting chairs at the universities of Reading and Westminster. Mark has resided in the relative safety of academia since leaving the British Merchant Navy, where he was a navigating officer, in the early 1980s. Mark has held posts at various universities, including Cardiff, Edinburgh and Syracuse. He has a number of geek-like interests, which his research group at Portsmouth help him pursue. These include wide-area resource monitoring, messaging systems for parallel and wide-area applications, middleware such as information and security services, as well as performance evaluation and modelling of computer systems.
Mark’s non-academic interests include squash (getting too old), DIY (he may one day finish his house off), reading (far too many science fiction books), keeping the garden ship-shape and a beer or two to reduce the pain of the aforementioned activities.

Read an Excerpt

The Grid

Core Technologies
By Maozhen Li and Mark Baker

John Wiley & Sons

Copyright © 2005 John Wiley & Sons, Ltd
All rights reserved.

ISBN: 0-470-09417-6


Chapter One

An Introduction to the Grid

1.1 INTRODUCTION

The Grid concepts and technologies are all very new, first expressed by Foster and Kesselman in 1998. Before this, efforts to orchestrate wide-area distributed resources were known as metacomputing. Even so, whichever date we use to identify when efforts in this area started, compared to general distributed computing, the Grid is a very new discipline and its exact focus and the core components that make up its infrastructure are still being investigated and have yet to be determined. Generally it can be said that the Grid has evolved from a carefully configured infrastructure that supported a limited number of grand challenge applications executing on high-performance hardware between a number of US national centres, to what we are aiming at today, which can be seen as a seamless and dynamic virtual environment. In this book we take a step-by-step approach to describe the middleware components that make up this virtual environment which is now called the Grid.

1.2 CHARACTERIZATION OF THE GRID

Before we go any further we need to somehow define and characterize what can be seen as a Grid infrastructure. To start with, let us think about the execution of a distributed application. Here we usually visualize running such an application "on top" of a software layer called middleware that unifies the resources being used by the application into a single coherent virtual machine. To help understand this view of a distributed application and its accompanying middleware, consider Figure 1.1, which shows the hardware and software components that would be typically found on a PC-based cluster. This view then raises the question, what is the difference between a distributed system and the Grid? Obviously the Grid is a type of distributed system, but this does not really answer the question. So, perhaps we should try and establish "What is a Grid?"

In 1998, Ian Foster and Carl Kesselman provided an initial definition in their book The Grid: Blueprint for a New Computing Infrastructure: "A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities." This particular definition stems from the earlier roots of the Grid, that of interconnecting high-performance facilities at various US laboratories and universities.

Since this early definition there have been a number of other attempts to define what a Grid is. For example, "A grid is a software framework providing layers of services to access and manage distributed hardware and software resources" or a "widely distributed network of high-performance computers, stored data, instruments, and collaboration environments shared across institutional boundaries". In 2001, Foster, Kesselman and Tuecke refined their definition of a Grid to "coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations". This latest definition is the one most commonly used today to abstractly define a Grid.

Foster later produced a checklist that could be used to help understand exactly what can be identified as a Grid system. He suggested that the checklist should have three parts to it. The first part to check off is that there is coordinated resource sharing with no centralized point of control, and that the users reside within different administrative domains. If this is not true, it is probably the case that this is not a Grid system. The second part to check off is the use of standard, open, general-purpose protocols and interfaces. If this is not the case it is unlikely that system components will be able to communicate or interoperate, and it is likely that we are dealing with an application-specific system, and not the Grid. The final part to check off is that of delivering non-trivial qualities of service. Here we are considering how the components that make up a Grid can be used in a coordinated way to deliver combined services, which are appreciably greater than the sum of the individual components. These services may be associated with throughput, response time, mean time between failures, security or many other facets.
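
The three-part checklist can be read as a simple predicate over a description of a candidate system. The following is only an illustrative sketch, not anything taken from Foster's checklist itself: the SystemProfile class, its field names and the two example systems are all hypothetical, and real systems rarely reduce to yes/no answers.

```python
from dataclasses import dataclass


@dataclass
class SystemProfile:
    """Hypothetical description of a candidate system (all field names are illustrative)."""
    name: str
    centralized_control: bool           # is there a single point of control?
    administrative_domains: int         # number of distinct domains the users span
    uses_open_standard_protocols: bool  # standard, open, general-purpose protocols and interfaces?
    delivers_nontrivial_qos: bool       # combined service greater than the sum of its parts?


def checklist(profile):
    """Return the outcome of each of the three checklist parts."""
    return {
        "coordinated sharing without central control": (
            not profile.centralized_control
            and profile.administrative_domains > 1
        ),
        "standard, open, general-purpose protocols": profile.uses_open_standard_protocols,
        "non-trivial qualities of service": profile.delivers_nontrivial_qos,
    }


def is_grid(profile):
    """A system passes only if all three parts are checked off."""
    return all(checklist(profile).values())


# Two made-up systems: a single-site cluster and a multi-institution infrastructure.
cluster = SystemProfile("departmental cluster", True, 1, True, False)
federation = SystemProfile("multi-site federation", False, 5, True, True)

for system in (cluster, federation):
    print(system.name, is_grid(system), checklist(system))
```

Returning the per-part results alongside the overall verdict makes it easy to see which part of the checklist a system fails, which is usually more informative than the final answer alone.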

From a commercial viewpoint, IBM define a grid as "a standards-based application/resource sharing architecture that makes it possible for heterogeneous systems and applications to share compute and storage resources transparently".

So, overall, we can say that the Grid is about resource sharing; this includes computers, storage, sensors and networks. Sharing is obviously always conditional and based on factors like trust, resource-based policies, negotiation and how payment should be considered. The Grid also includes coordinated problem solving, which goes beyond the simple client-server paradigm, where we may be interested in combinations of distributed data analysis, computation and collaboration. The Grid also involves dynamic, multi-institutional Virtual Organizations (VOs), where these new communities overlay classical organization structures, and these virtual organizations may be large or small, static or dynamic. The LHC Computing Grid Project at CERN is a classic example of where VOs are being used in anger.

1.3 GRID-RELATED STANDARDS BODIES

For Grid-related technologies, tools and utilities to be taken up widely by the community at large, it is vital that developers design their software to conform to the relevant standards. For the Grid community, the most important standards organizations are the Global Grid Forum (GGF), which is the primary standards setting organization for the Grid, and OASIS, a not-for-profit consortium that drives the development, convergence and adoption of e-business standards, which is having an increasing influence on Grid standards. Another body involved with related standards efforts is the Distributed Management Task Force (DMTF), where there are overlaps and ongoing collaborative efforts on its management standards, the Common Information Model (CIM) and Web-Based Enterprise Management (WBEM). In addition, the World Wide Web Consortium (W3C) is also active in setting Web services standards, particularly those that relate to XML.

The GGF produces four document types related to standards that are defined as:

Informational: These are used to inform the community about a useful idea or set of ideas, for example GFD.7 (A Grid Monitoring Architecture), GFD.8 (A Simple Case Study of a Grid Performance System) and GFD.11 (Grid Scheduling Dictionary of Terms and Keywords). There are currently eighteen Informational documents from a range of working groups.

Experimental: These are used to inform the community about a useful experiment, testbed or implementation of an idea or set of ideas, for example GFD.5 (Advanced Reservation API), GFD.21 (GridFTP Protocol Improvements) and GFD.24 (GSS-API Extensions). There are currently three Experimental documents.

Community practice: These are to inform the community of common practice or process, with the objective to influence the community, for example GFD.1 (GGF Document Series), GFD.3 (GGF Management) and GFD.16 (GGF Certificate Policy Model). There are currently four Common Practice documents.

Recommendations: These are used to document a specification, analogous to an Internet Standards track document, for example GFD.15 (Open Grid Services Infrastructure), GFD.20 (GridFTP: Protocol Extensions to FTP for the Grid) and GFD.23 (A Hierarchy of Network Performance Characteristics for Grid Applications and Services). There are currently four Recommendation documents.

1.4 THE ARCHITECTURE OF THE GRID

Perhaps the most important standard that has emerged recently is the Open Grid Services Architecture (OGSA), which was developed by the GGF. OGSA is an Informational specification that aims to define a common, standard and open architecture for Grid-based applications. The goal of OGSA is to standardize almost all the services that a grid application may use, for example job and resource management services, communications and security. OGSA specifies a Service-Oriented Architecture (SOA) for the Grid that realizes a model of a computing system as a set of distributed computing patterns realized using Web services as the underlying technology. Basically, the OGSA standard defines service interfaces and identifies the protocols for invoking these services.
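
To make the idea of invoking a service through a Web services protocol a little more concrete, the sketch below builds a minimal SOAP 1.1 request using only the Python standard library. It is an illustration under stated assumptions rather than anything defined by OGSA: the service namespace urn:example:grid:job, the submitJob operation and its parameters are all hypothetical, and a real Grid service would publish its actual operations in a WSDL document.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical namespace for an imaginary job-management service.
SERVICE_NS = "urn:example:grid:job"


def build_request(operation, params):
    """Wrap one operation call and its parameters in a SOAP envelope."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        # Each parameter becomes a child element of the operation element.
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# "submitJob", "executable" and "count" are made-up names for illustration.
message = build_request("submitJob", {"executable": "/bin/hostname", "count": 4})
print(message)
```

The resulting XML string is what a client would send as the body of an HTTP POST to the service endpoint; the service interface determines which operations and parameters are valid.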

OGSA was first announced at GGF4 in February 2002. In March 2004, at GGF10, it was declared as the GGF's flagship architecture. The OGSA document, first released at GGF11 in June 2004, explains the OGSA Working Group's current thinking on the required capabilities and was released in order to stimulate further discussion. Instantiations of OGSA depend on emerging specifications (e.g. WS-RF and WS-Notification). Currently the OGSA document does not contain sufficient information to develop an actual implementation of an OGSA-based system. A comprehensive analysis of OGSA was undertaken by Gannon et al., and is well worth reading.

There are many standards involved in building a service-oriented Grid architecture, which form the basic building blocks that allow applications to execute service requests. The Web services-based standards and specifications include:

Program-to-program interaction (SOAP, WSDL and UDDI);

Data sharing (eXtensible Markup Language - XML);

Messaging (SOAP and WS-Addressing);

Reliable messaging (WS-ReliableMessaging);

Managing workload (WS-Management);

Transaction-handling (WS-Coordination and WS-AtomicTransaction);

Managing resources (WS-RF or Web Services Resource Framework);

Establishing security (WS-Security, WS-SecureConversation, WS-Trust and WS-Federation);

Handling metadata (WSDL, UDDI and WS-Policy);

Building and integrating Web Services architecture over a Grid (see OGSA);

Overlaying business process flow (Business Process Execution Language for Web Services - BPEL4WS);

Triggering process flow events (WS-Notification).

As the list above indicates, developing a solid and concrete instantiation of OGSA is currently difficult because the target is moving: it is not yet known which standards or specifications will emerge and/or become popular. This leaves the Grid community with a dilemma as to exactly which route to take in developing its middleware. For example, WS-GAF and WS-I are being mooted as possible alternative routes to WS-RF.

Later in this book (Chapters 2 and 3), we describe in depth what is briefly outlined here in Sections 1.2-1.4.

(Continues...)



Excerpted from The Grid by Maozhen Li and Mark Baker. Copyright © 2005 by John Wiley & Sons, Ltd. Excerpted by permission.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Excerpts are provided by Dial-A-Book Inc. solely for the personal use of visitors to this web site.

Table of Contents

About the Authors

Preface

Acknowledgements

List of Abbreviations

1 An Introduction to the Grid

1.1 Introduction

1.2 Characterization of the Grid

1.3 Grid-Related Standards Bodies

1.4 The Architecture of the Grid

1.5 References

Part One System Infrastructure

2 OGSA and WSRF

Learning Objectives

Chapter Outline

2.1 Introduction

2.2 Traditional Paradigms for Distributed Computing

2.2.1 Socket programming

2.2.2 RPC

2.2.3 Java RMI

2.2.4 DCOM

2.2.5 CORBA

2.2.6 A summary on Java RMI, DCOM and CORBA

2.3 Web Services

2.3.1 SOAP

2.3.2 WSDL

2.3.3 UDDI

2.3.4 WS-Inspection

2.3.5 WS-Inspection and UDDI

2.3.6 Web services implementations

2.3.7 How Web services benefit the Grid

2.4 OGSA

2.4.1 Service instance semantics

2.4.2 Service data semantics

2.4.3 OGSA portTypes

2.4.4 A further discussion on OGSA

2.5 The Globus Toolkit 3 (GT3)

2.5.1 Host environment

2.5.2 Web services engine

2.5.3 Grid services container

2.5.4 GT3 core services

2.5.5 GT3 base services

2.5.6 The GT3 programming model

2.6 OGSA-DAI

2.6.1 OGSA-DAI portTypes

2.6.2 OGSA-DAI functionality

2.6.3 Services interaction in the OGSA-DAI

2.6.4 OGSA-DAI and DAIS

2.7 WSRF

2.7.1 An introduction to WSRF

2.7.2 WSRF and OGSI/GT3

2.7.3 WSRF and OGSA

2.7.4 A summary of WSRF

2.8 Chapter Summary

2.9 Further Reading and Testing

2.10 Key Points

2.11 References

3 The Semantic Grid and Autonomic Computing

Learning Outcomes

Chapter Outline

3.1 Introduction

3.2 Metadata and Ontology in the Semantic Web

3.2.1 RDF

3.2.2 Ontology languages

3.2.3 Ontology editors

3.2.4 A summary of Web ontology languages

3.3 Semantic Web Services

3.3.1 DAML-S

3.3.2 OWL-S

3.4 A Layered Structure of the Semantic Grid

3.5 Semantic Grid Activities

3.5.1 Ontology-based Grid resource matching

3.5.2 Semantic workflow registration and discovery in myGrid

3.5.3 Semantic workflow enactment in Geodise

3.5.4 Semantic service annotation and adaptation in ICENI

3.5.5 PortalLab – A Semantic Grid portal toolkit

3.5.6 Data provenance on the Grid

3.5.7 A summary on the Semantic Grid

3.6 Autonomic Computing

3.6.1 What is autonomic computing?

3.6.2 Features of autonomic computing systems

3.6.3 Autonomic computing projects

3.6.4 A vision of autonomic Grid services

3.7 Chapter Summary

3.8 Further Reading and Testing

3.9 Key Points

3.10 References

Part Two Basic Services

4 Grid Security

4.1 Introduction

4.2 A Brief Security Primer

4.3 Cryptography

4.3.1 Introduction

4.3.2 Symmetric cryptosystems

4.3.3 Asymmetric cryptosystems

4.3.4 Digital signatures

4.3.5 Public-key certificate

4.3.6 Certification Authority (CA)

4.3.7 Firewalls

4.4 Grid Security

4.4.1 The Grid Security Infrastructure (GSI)

4.4.2 Authorization modes in GSI

4.5 Putting it all Together

4.5.1 Getting an e-Science certificate

4.5.2 Managing credentials in Globus

4.5.3 Generate a client proxy

4.5.4 Firewall traversal

4.6 Possible Vulnerabilities

4.6.1 Authentication

4.6.2 Proxies

4.6.3 Authorization

4.7 Summary

4.8 Acknowledgements

4.9 Further Reading

4.10 References

5 Grid Monitoring

5.1 Introduction

5.2 Grid Monitoring Architecture (GMA)

5.2.1 Consumer

5.2.2 The Directory Service

5.2.3 Producers

5.2.4 Monitoring data

5.3 Review Criteria

5.3.1 Scalable wide-area monitoring

5.3.2 Resource monitoring

5.3.3 Cross-API monitoring

5.3.4 Homogeneous data presentation

5.3.5 Information searching

5.3.6 Run-time extensibility

5.3.7 Filtering/fusing of data

5.3.8 Open and standard protocols

5.3.9 Security

5.3.10 Software availability and dependencies

5.3.11 Projects that are active and supported; plus licensing

5.4 An Overview of Grid Monitoring Systems

5.4.1 Autopilot

5.4.2 Control and Observation in Distributed Environments (CODE)

5.4.3 GridICE

5.4.4 Grid Portals Information Repository (GPIR)

5.4.5 GridRM

5.4.6 Hawkeye

5.4.7 Java Agents for Monitoring and Management (JAMM)

5.4.8 MapCenter

5.4.9 Monitoring and Discovery Service (MDS3)

5.4.10 Mercury

5.4.11 Network Weather Service

5.4.12 The Relational Grid Monitoring Architecture (R-GMA)

5.4.13 visPerf

5.5 Other Monitoring Systems

5.5.1 Ganglia

5.5.2 GridMon

5.5.3 GRM/PROVE

5.5.4 Nagios

5.5.5 NetLogger

5.5.6 SCALEA-G

5.6 Summary

5.6.1 Resource categories

5.6.2 Native agents

5.6.3 Architecture

5.6.4 Interoperability

5.6.5 Homogeneous data presentation

5.6.6 Intrusiveness of monitoring

5.6.7 Information searching and retrieval

5.7 Chapter Summary

5.8 Further Reading and Testing

5.9 Key Points

5.10 References

Part Three Job Management and User Interaction

6 Grid Scheduling and Resource Management

Learning Objectives

Chapter Outline

6.1 Introduction

6.2 Scheduling Paradigms

6.2.1 Centralized scheduling

6.2.2 Distributed scheduling

6.2.3 Hierarchical scheduling

6.3 How Scheduling Works

6.3.1 Resource discovery

6.3.2 Resource selection

6.3.3 Schedule generation

6.3.4 Job execution

6.4 A Review of Condor, SGE, PBS and LSF

6.4.1 Condor

6.4.2 Sun Grid Engine

6.4.3 The Portable Batch System (PBS)

6.4.4 LSF

6.4.5 A comparison of Condor, SGE, PBS and LSF

6.5 Grid Scheduling with QoS

6.5.1 AppLeS

6.5.2 Scheduling in GrADS

6.5.3 Nimrod/G

6.5.4 Rescheduling

6.5.5 Scheduling with heuristics

6.6 Chapter Summary

6.7 Further Reading and Testing

6.8 Key Points

6.9 References

7 Workflow Management for the Grid

Learning Outcomes

Chapter Outline

7.1 Introduction

7.2 The Workflow Management Coalition

7.2.1 The workflow enactment service

7.2.2 The workflow engine

7.2.3 WfMC interfaces

7.2.4 Other components in the WfMC reference model

7.2.5 A summary of WfMC reference model

7.3 Web Services-Oriented Flow Languages

7.3.1 XLANG

7.3.2 Web services flow language

7.3.3 WSCI

7.3.4 BPEL4WS

7.3.5 BPML

7.3.6 A summary of Web services flow languages

7.4 Grid Services-Oriented Flow Languages

7.4.1 GSFL

7.4.2 SWFL

7.4.3 GWEL

7.4.4 GALE

7.4.5 A summary of Grid services flow languages

7.5 Workflow Management for the Grid

7.5.1 Grid workflow management projects

7.5.2 A summary of Grid workflow management

7.6 Chapter Summary

7.7 Further Reading and Testing

7.8 Key Points

7.9 References

8 Grid Portals

Learning Outcomes

Chapter Outline

8.1 Introduction

8.2 First-Generation Grid Portals

8.2.1 A three-tiered architecture

8.2.2 Grid portal services

8.2.3 First-generation Grid portal implementations

8.2.4 First-generation Grid portal toolkits

8.2.5 A summary of the four portal tools

8.2.6 A summary of first-generation Grid portals

8.3 Second-Generation Grid Portals

8.3.1 An introduction to portlets

8.3.2 Portlet specifications

8.3.3 Portal frameworks supporting portlets

8.3.4 A Comparison of Jetspeed, WebSphere Portal and GridSphere

8.3.5 The development of Grid portals with portlets

8.3.6 A summary on second-generation Grid portals

8.4 Chapter Summary

8.5 Further Reading and Testing

8.6 Key Points

8.7 References

Part Four Applications

9 Grid Applications – Case Studies

Learning Objectives

Chapter Outline

9.1 Introduction

9.2 GT3 Use Cases

9.2.1 GT3 in broadcasting

9.2.2 GT3 in software reuse

9.2.3 A GT3 bioinformatics application

9.3 OGSA-DAI Use Cases

9.3.1 eDiaMoND

9.3.2 ODD-Genes

9.4 Resource Management Case Studies

9.4.1 The UCL Condor pool

9.4.2 SGE use cases

9.5 Grid Portal Use Cases

9.5.1 Chiron

9.5.2 GENIUS

9.6 Workflow Management – Discovery Net Use Cases

9.6.1 Genome annotation

9.6.2 SARS virus evolution analysis

9.6.3 Urban air pollution monitoring

9.6.4 Geo-hazard modelling

9.7 Semantic Grid – myGrid Use Case

9.8 Autonomic Computing – AutoMate Use Case

9.9 Conclusions

9.10 References

Glossary

Index

What People are Saying About This

From the Publisher

"It could serve as a good textbook and would certainly be a good addition to the reference libraries of technologists, academics, and students." (IEEE Distributed Systems Online, December 2006)

"…lots of valuable information." (Computing Reviews.com, May 11, 2006)

"…a complete, clear, systematic, and practical understanding of the technologies that enable the Grid." (IEEE Computer Magazine, August 2005)

"…a good addition to the reference library…" (IEEE DS Online, January 2007)
