Optimization Algorithms on Matrix Manifolds

ISBN-10: 0691132984
ISBN-13: 9780691132983
Pub. Date: 12/23/2007
Publisher: Princeton University Press
Price: $88.00

Overview

Many problems in the sciences and engineering can be rephrased as optimization problems on matrix search spaces endowed with a so-called manifold structure. This book shows how to exploit the special structure of such problems to develop efficient numerical algorithms. It places careful emphasis on both the numerical formulation of the algorithm and its differential geometric abstraction—illustrating how good algorithms draw equally from the insights of differential geometry, optimization, and numerical analysis. Two more theoretical chapters provide readers with the background in differential geometry necessary to algorithmic development. In the other chapters, several well-known optimization methods such as steepest descent and conjugate gradients are generalized to abstract manifolds. The book provides a generic development of each of these methods, building upon the material of the geometric chapters. It then guides readers through the calculations that turn these geometrically formulated methods into concrete numerical algorithms. The state-of-the-art algorithms given as examples are competitive with the best existing algorithms for a selection of eigenspace problems in numerical linear algebra.



Optimization Algorithms on Matrix Manifolds offers techniques with broad applications in linear algebra, signal processing, data mining, computer vision, and statistical analysis. It can serve as a graduate-level textbook and will be of interest to applied mathematicians, engineers, and computer scientists.


Product Details

ISBN-13: 9780691132983
Publisher: Princeton University Press
Publication date: 12/23/2007
Pages: 240
Product dimensions: 6.00(w) x 9.25(h) x (d)

About the Author

P.-A. Absil is associate professor of mathematical engineering at the Université Catholique de Louvain in Belgium. R. Mahony is reader in engineering at the Australian National University. R. Sepulchre is professor of electrical engineering and computer science at the University of Liège in Belgium.

Read an Excerpt

Optimization Algorithms on Matrix Manifolds


By P.-A. Absil, R. Mahony, and R. Sepulchre
Copyright © 2007 Princeton University Press
All rights reserved.

ISBN: 978-0-691-13298-3


Chapter One: Introduction

This book is about the design of numerical algorithms for computational problems posed on smooth search spaces. The work is motivated by matrix optimization problems characterized by symmetry or invariance properties in the cost function or constraints. Such problems abound in algorithmic questions pertaining to linear algebra, signal processing, data mining, and statistical analysis. The approach taken here is to exploit the special structure of these problems to develop efficient numerical procedures.

An illustrative example is the eigenvalue problem. Because of their scale invariance, eigenvectors are not isolated in vector spaces. Instead, each eigendirection defines a linear subspace of eigenvectors. For numerical computation, however, it is desirable that the solution set consist only of isolated points in the search space. An obvious remedy is to impose a norm equality constraint on iterates of the algorithm. The resulting spherical search space is an embedded submanifold of the original vector space. An alternative approach is to "factor" the vector space by the scale-invariant symmetry operation such that any subspace becomes a single point. The resulting search space is a quotient manifold of the original vector space. These two approaches provide prototype structures for the problems considered in this book.
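
To make the two remedies concrete, consider the standard Rayleigh quotient formulation of the symmetric eigenvalue problem (the same running example the book develops in detail in Chapters 2 and 4; the notation here is a restatement, not an excerpt). For a symmetric matrix A, the eigenvectors associated with the smallest eigenvalue are exactly the minimizers of

    f(x) = (x^T A x) / (x^T x),    x in R^n,  x != 0,

and f(alpha x) = f(x) for every alpha != 0, so minimizers come in whole lines through the origin. Restricting f to the sphere S^{n-1} = {x : x^T x = 1} (the embedded-submanifold remedy) reduces each minimizing line to two antipodal points, while identifying x with -x (the quotient remedy) yields the real projective space RP^{n-1}, on which the minimizer is unique whenever the smallest eigenvalue is simple.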

Scale invariance is just one of several symmetry properties regularly encountered in computational problems. In many cases, the underlying symmetry property can be exploited to reformulate the problem as a nondegenerate optimization problem on an embedded or quotient manifold associated with the original matrix representation of the search space. These constraint sets carry the structure of nonlinear matrix manifolds. This book provides the tools to exploit such structure in order to develop efficient matrix algorithms in the underlying total vector space.

Working with a search space that carries the structure of a nonlinear manifold introduces certain challenges in the algorithm implementation. In their classical formulation, iterative optimization algorithms rely heavily on the Euclidean vector space structure of the search space; a new iterate is generated by adding an update increment to the previous iterate in order to reduce the cost function. The update direction and step size are generally computed using a local model of the cost function, typically based on (approximate) first and second derivatives of the cost function at each step. In order to define algorithms on manifolds, these operations must be translated into the language of differential geometry. This process is a significant research program that builds upon solid mathematical foundations. Advances in that direction have been dramatic over the last two decades and have led to a solid conceptual framework. However, generalizing a given optimization algorithm to an abstract manifold is only the first step towards the objective of this book. Turning the algorithm into an efficient numerical procedure is a second step that ultimately justifies or invalidates the first part of the effort. At the time of publishing this book, the second step is more an art than a theory.
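
In symbols, restated in the retraction notation that Chapter 4 introduces, the classical update and its manifold counterpart read

    x_{k+1} = x_k + alpha_k eta_k          (Euclidean vector space)
    x_{k+1} = R_{x_k}(alpha_k eta_k)       (manifold)

where eta_k is a tangent vector at the iterate x_k, alpha_k is a step size, and the retraction R_{x_k} maps the tangent space at x_k back onto the manifold. The first formula is unavailable on a nonlinear manifold, since the sum of a point and a tangent vector need not belong to the search space; the second restores it at modest computational cost.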

Good algorithms result from the combination of insight from differential geometry, optimization, and numerical analysis. A distinctive feature of this book is that as much attention is paid to the practical implementation of the algorithm as to its geometric formulation. In particular, the concrete aspects of algorithm design are formalized with the help of the concepts of retraction and vector transport, which are relaxations of the classical geometric concepts of motion along geodesics and parallel transport. The proposed approach provides a framework to optimize the efficiency of the numerical algorithms while retaining the convergence properties of their abstract geometric counterparts.
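
To illustrate, the following is a minimal NumPy sketch (ours, not code from the book) of gradient descent for the Rayleigh quotient on the unit sphere, with the normalization retraction standing in for motion along geodesics; the fixed step size and tolerance are arbitrary choices.

import numpy as np

def rayleigh_descent(A, x0, step=0.1, tol=1e-8, max_iter=5000):
    # Minimize f(x) = x^T A x over the unit sphere by Riemannian
    # gradient descent, using the normalization retraction
    # R_x(eta) = (x + eta) / ||x + eta|| in place of the exponential map.
    x = x0 / np.linalg.norm(x0)
    for _ in range(max_iter):
        egrad = 2 * (A @ x)               # Euclidean gradient of f
        rgrad = egrad - (x @ egrad) * x   # project onto the tangent space at x
        if np.linalg.norm(rgrad) < tol:
            break
        y = x - step * rgrad              # step in the tangent space
        x = y / np.linalg.norm(y)         # retract back to the sphere
    return x

# Usage: x approximates an eigenvector of the smallest eigenvalue of A.
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2
x = rayleigh_descent(A, rng.standard_normal(5))

Chapter 4 replaces the fixed step with an Armijo line search and analyzes the convergence of the resulting method.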

The geometric material in the book is mostly confined to Chapters 3 and 5. Chapter 3 presents an introduction to Riemannian manifolds and tangent spaces that provides the necessary tools to tackle simple gradient descent optimization algorithms on matrix manifolds. Chapter 5 covers the advanced material needed to define higher order derivatives on manifolds and to build the analog of first and second order local models required in most optimization algorithms. The development provided in these chapters ranges from the foundations of differential geometry to advanced material relevant to our applications. The selected material focuses on those geometric concepts that are particular to the development of numerical algorithms on embedded and quotient manifolds. Not all aspects of classical differential geometry are covered, and some emphasis is placed on material that is nonstandard or difficult to find in the established literature. A newcomer to the field of differential geometry may wish to supplement this material with a classical text. Suggestions for excellent texts are provided in the references.

A fundamental, but deliberate, omission in the book is a treatment of the geometric structure of Lie groups and homogeneous spaces. Lie theory is derived from the concepts of symmetry and seems to be a natural part of a treatise such as this. However, with the purpose of reaching a community without an extensive background in geometry, we have omitted this material in the present book. Occasionally the Lie-theoretic approach provides an elegant shortcut or interpretation for the problems considered. An effort is made throughout the book to refer the reader to the relevant literature whenever appropriate.

The algorithmic material of the book is interlaced with the geometric material. Chapter 4 considers gradient-descent line-search algorithms. These simple optimization algorithms provide an excellent framework within which to study the important issues associated with the implementation of practical algorithms. The concept of retraction is introduced in Chapter 4 as a key step in developing efficient numerical algorithms on matrix manifolds. The later chapters on algorithms provide the core results of the book: the development of Newton-based methods in Chapter 6 and of trust-region methods in Chapter 7, and a survey of other superlinear methods such as conjugate gradients in Chapter 8. We attempt to provide a generic development of each of these methods, building upon the material of the geometric chapters. The methodology is then developed into concrete numerical algorithms on specific examples. In the analysis of superlinear and second-order methods, the concept of vector transport (introduced in Chapter 8) is used to provide an efficient implementation of methods such as conjugate gradients and other quasi-Newton methods. The algorithms obtained in these sections of the book are competitive with state-of-the-art numerical linear algebra algorithms for certain problems.
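
As a flavor of what vector transport buys, here is a sketch of ours (continuing the sphere example above, in the spirit of the projection-based transports of Chapter 8): a tangent vector at a previous iterate is carried into the tangent space at the new iterate by orthogonal projection.

def transport(x_new, eta):
    # Vector transport on the unit sphere by orthogonal projection:
    # carry the tangent vector eta of a previous iterate into the
    # tangent space at x_new, a cheap stand-in for parallel translation.
    return eta - (x_new @ eta) * x_new

A Riemannian conjugate gradient step must combine the current gradient with the previous search direction, which lives in the tangent space of the previous iterate; a transport of this kind moves it to where it is needed.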

The running example used throughout the book is the calculation of invariant subspaces of a matrix (and the many variants of this problem). Of the problems treated within the proposed framework, this example has by far the broadest scope of applications and the highest degree of achievement to date. Numerical algorithms, based on a geometric formulation, have been developed that compete with the best available algorithms for certain classes of invariant subspace problems. These algorithms are explicitly described in the later chapters of the book and, in part, motivate the whole project. Because of the important role of this class of problems within the book, the first part of Chapter 2 provides a detailed description of the invariant subspace problem, explaining why and how this problem leads naturally to an optimization problem on a matrix manifold. The second part of Chapter 2 presents other applications that can be recast as problems of the same nature. These problems are the subject of ongoing research, and the brief exposition given is primarily an invitation for interested researchers to join with us in investigating these problems and expanding the range of applications considered.

The book should primarily be considered a research monograph, as it reports on recently published results in an active research area that is expected to develop significantly beyond the material presented here. At the same time, every possible effort has been made to make the book accessible to the broadest audience, including applied mathematicians, engineers, and computer scientists with little or no background in differential geometry. It could equally well qualify as a graduate textbook for a one-semester course in advanced optimization. More advanced sections that can be readily skipped at a first reading are indicated with a star. Moreover, readers are encouraged to visit the book's home page, where supplementary material is available.

The book is an extension of the first author's Ph.D. thesis [Abs03], itself a project that drew heavily on the material of the second author's Ph.D. thesis [Mah94]. It would not have been possible without the many contributions of a quickly expanding research community that has been working in the area over the last decade. The Notes and References section at the end of each chapter is an attempt to give proper credit to the many contributors, even though this task becomes increasingly difficult for recent contributions. The authors apologize for any omission or error in these notes. In addition, we wish to conclude this introductory chapter with special acknowledgements to people without whom this project would have been impossible. The 1994 monograph [HM94] by Uwe Helmke and John Moore is a milestone in the formulation of computational problems as optimization algorithms on manifolds and has had a profound influence on the authors. On the numerical side, the constant encouragement of Paul Van Dooren and Kyle Gallivan has provided tremendous support to our efforts to reconcile the perspectives of differential geometry and numerical linear algebra. We are also grateful to all our colleagues and friends over the last ten years who have crossed paths as coauthors, reviewers, and critics of our work. Special thanks to Ben Andrews, Chris Baker, Alan Edelman, Michiel Hochstenbach, Knut Hüper, Jonathan Manton, Robert Orsi, and Jochen Trumpf. Finally, we acknowledge the useful feedback of many students on preliminary versions of the book, in particular, Mariya Ishteva, Michel Journée, and Alain Sarlette.

(Continues...)



Excerpted from Optimization Algorithms on Matrix Manifolds by P.-A. Absil, R. Mahony, and R. Sepulchre
Copyright © 2007 by Princeton University Press. Excerpted by permission.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.

Table of Contents

List of Algorithms xi

Foreword, by Paul Van Dooren xiii

Notation Conventions xv





Chapter 1. Introduction 1

Chapter 2. Motivation and Applications 5

2.1 A case study: the eigenvalue problem 5

2.1.1 The eigenvalue problem as an optimization problem 7

2.1.2 Some benefits of an optimization framework 9

2.2 Research problems 10

2.2.1 Singular value problem 10

2.2.2 Matrix approximations 12

2.2.3 Independent component analysis 13

2.2.4 Pose estimation and motion recovery 14

2.3 Notes and references 16





Chapter 3. Matrix Manifolds: First-Order Geometry 17

3.1 Manifolds 18

3.1.1 Definitions: charts, atlases, manifolds 18

3.1.2 The topology of a manifold* 20

3.1.3 How to recognize a manifold 21

3.1.4 Vector spaces as manifolds 22

3.1.5 The manifolds R^{n×p} and R_*^{n×p} 22

3.1.6 Product manifolds 23

3.2 Differentiable functions 24

3.2.1 Immersions and submersions 24

3.3 Embedded submanifolds 25

3.3.1 General theory 25

3.3.2 The Stiefel manifold 26

3.4 Quotient manifolds 27

3.4.1 Theory of quotient manifolds 27

3.4.2 Functions on quotient manifolds 29

3.4.3 The real projective space RP^{n-1} 30

3.4.4 The Grassmann manifold Grass(p, n) 30

3.5 Tangent vectors and differential maps 32

3.5.1 Tangent vectors 33

3.5.2 Tangent vectors to a vector space 35

3.5.3 Tangent bundle 36

3.5.4 Vector fields 36

3.5.5 Tangent vectors as derivations* 37

3.5.6 Differential of a mapping 38

3.5.7 Tangent vectors to embedded submanifolds 39

3.5.8 Tangent vectors to quotient manifolds 42

3.6 Riemannian metric, distance, and gradients 45

3.6.1 Riemannian submanifolds 47

3.6.2 Riemannian quotient manifolds 48

3.7 Notes and references 51





Chapter 4. Line-Search Algorithms on Manifolds 54

4.1 Retractions 54

4.1.1 Retractions on embedded submanifolds 56

4.1.2 Retractions on quotient manifolds 59

4.1.3 Retractions and local coordinates* 61

4.2 Line-search methods 62

4.3 Convergence analysis 63

4.3.1 Convergence on manifolds 63

4.3.2 A topological curiosity* 64

4.3.3 Convergence of line-search methods 65

4.4 Stability of fixed points 66

4.5 Speed of convergence 68

4.5.1 Order of convergence 68

4.5.2 Rate of convergence of line-search methods* 70

4.6 Rayleigh quotient minimization on the sphere 73

4.6.1 Cost function and gradient calculation 74

4.6.2 Critical points of the Rayleigh quotient 74

4.6.3 Armijo line search 76

4.6.4 Exact line search 78

4.6.5 Accelerated line search: locally optimal conjugate gradient 78

4.6.6 Links with the power method and inverse iteration 78

4.7 Refining eigenvector estimates 80

4.8 Brockett cost function on the Stiefel manifold 80

4.8.1 Cost function and search direction 80

4.8.2 Critical points 81

4.9 Rayleigh quotient minimization on the Grassmann manifold 83

4.9.1 Cost function and gradient calculation 83

4.9.2 Line-search algorithm 85

4.10 Notes and references 86





Chapter 5. Matrix Manifolds: Second-Order Geometry 91

5.1 Newton's method in R^n 91

5.2 Affine connections 93

5.3 Riemannian connection 96

5.3.1 Symmetric connections 96

5.3.2 Definition of the Riemannian connection 97

5.3.3 Riemannian connection on Riemannian submanifolds 98

5.3.4 Riemannian connection on quotient manifolds 100

5.4 Geodesics, exponential mapping, and parallel translation 101

5.5 Riemannian Hessian operator 104

5.6 Second covariant derivative* 108

5.7 Notes and references 110





Chapter 6. Newton's Method 111

6.1 Newton's method on manifolds 111

6.2 Riemannian Newton method for real-valued functions 113

6.3 Local convergence 114

6.3.1 Calculus approach to local convergence analysis 117

6.4 Rayleigh quotient algorithms 118

6.4.1 Rayleigh quotient on the sphere 118

6.4.2 Rayleigh quotient on the Grassmann manifold 120

6.4.3 Generalized eigenvalue problem 121

6.4.4 The nonsymmetric eigenvalue problem 125

6.4.5 Newton with subspace acceleration: Jacobi-Davidson 126

6.5 Analysis of Rayleigh quotient algorithms 128

6.5.1 Convergence analysis 128

6.5.2 Numerical implementation 129

6.6 Notes and references 131





Chapter 7. Trust-Region Methods 136

7.1 Models 137

7.1.1 Models in R^n 137

7.1.2 Models in general Euclidean spaces 137

7.1.3 Models on Riemannian manifolds 138

7.2 Trust-region methods 140

7.2.1 Trust-region methods in R^n 140

7.2.2 Trust-region methods on Riemannian manifolds 140

7.3 Computing a trust-region step 141

7.3.1 Computing a nearly exact solution 142

7.3.2 Improving on the Cauchy point 143

7.4 Convergence analysis 145

7.4.1 Global convergence 145

7.4.2 Local convergence 152

7.4.3 Discussion 158

7.5 Applications 159

7.5.1 Checklist 159

7.5.2 Symmetric eigenvalue decomposition 160

7.5.3 Computing an extreme eigenspace 161

7.6 Notes and references 165





Chapter 8. A Constellation of Superlinear Algorithms 168

8.1 Vector transport 168

8.1.1 Vector transport and affine connections 170

8.1.2 Vector transport by differentiated retraction 172

8.1.3 Vector transport on Riemannian submanifolds 174

8.1.4 Vector transport on quotient manifolds 174

8.2 Approximate Newton methods 175

8.2.1 Finite difference approximations 176

8.2.2 Secant methods 178

8.3 Conjugate gradients 180

8.3.1 Application: Rayleigh quotient minimization 183

8.4 Least-square methods 184

8.4.1 Gauss-Newton methods 186

8.4.2 Levenberg-Marquardt methods 187

8.5 Notes and references 188





A. Elements of Linear Algebra, Topology, and Calculus 189

A.1 Linear algebra 189

A.2 Topology 191

A.3 Functions 193

A.4 Asymptotic notation 194

A.5 Derivatives 195

A.6 Taylor's formula 198





Bibliography 201

Index 221


What People are Saying About This

"The treatment strikes an appropriate balance between mathematical, numerical, and algorithmic points of view. The quality of the writing is quite high and very readable. The topic is very timely and is certainly of interest to myself and my students."
Kyle A. Gallivan, Florida State University
