Event Streams in Action: Integrating and processing event streams

Paperback (1st Edition)

$44.99 

Overview

Summary

Event Streams in Action is a foundational book introducing the unified log processing (ULP) paradigm and presenting techniques to use it effectively in data-rich environments.

Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.

About the Technology

Many high-profile applications, like LinkedIn and Netflix, deliver nimble, responsive performance by reacting to user and system events as they occur. In large-scale systems, this requires efficiently monitoring, managing, and reacting to multiple event streams. Tools like Kafka, along with innovative patterns like unified log processing, help create a coherent data processing architecture for event-based applications.
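
To make the idea concrete, here is a minimal, illustrative Java sketch of appending a single event to a Kafka topic. The broker address, topic name, and JSON payload below are assumptions chosen for demonstration, not examples taken from the book.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class CheckoutEventProducer {
    public static void main(String[] args) {
        // Broker address and serializers; "localhost:9092" is a placeholder.
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Append one JSON-encoded event to a hypothetical "checkout-events" stream.
            String event = "{\"event\":\"ORDER_PLACED\",\"orderId\":\"123\"}";
            producer.send(new ProducerRecord<>("checkout-events", "123", event));
        }
    }
}

In a unified log architecture, downstream applications consume such events independently from the same append-only stream, rather than each system integrating point-to-point with the others.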

About the Book

Event Streams in Action teaches you techniques for aggregating, storing, and processing event streams using the unified log processing pattern. In this hands-on guide, you'll discover important application designs like the lambda architecture, stream aggregation, and event reprocessing. You'll also explore scaling, resiliency, advanced stream patterns, and much more! By the time you're finished, you'll be designing large-scale data-driven applications that are easier to build, deploy, and maintain.

What's inside

  • Validating and monitoring event streams
  • Event analytics
  • Methods for event modeling
  • Examples using Apache Kafka and Amazon Kinesis

About the Reader

For readers with experience coding in Java, Scala, or Python.

About the Author

Alexander Dean developed Snowplow, an open source event processing and analytics platform. Valentin Crettaz is an independent IT consultant with 25 years of experience.

Table of Contents

  1. Introducing event streams
  2. The unified log
  3. Event stream processing with Apache Kafka
  4. Event stream processing with Amazon Kinesis
  5. Stateful stream processing
  6. Schemas
  7. Archiving events
  8. Railway-oriented processing
  9. Commands
  10. Analytics-on-read
  11. Analytics-on-write

Product Details

ISBN-13: 9781617292347
Publisher: Manning
Publication date: 05/30/2019
Edition description: 1st Edition
Pages: 344
Product dimensions: 7.30(w) x 9.10(h) x 0.80(d)

About the Author

Alexander Dean is co-founder and technical lead of Snowplow Analytics, an open source event processing and analytics platform.

Valentin Crettaz is an independent IT consultant who has spent the past 25 years working on challenging projects across the globe. His expertise ranges from software engineering and architecture to data science and business intelligence. Day to day, he applies cutting-edge web, data, and streaming technologies to build IT solutions that help bridge the gap between IT and business people.

Table of Contents

Preface xiii

Acknowledgments xiv

About this book xvi

About the authors xix

About the cover illustration xx

Part 1 Event Streams and Unified Logs 1

1 Introducing event streams 3

1.1 Defining our terms 4

Events 5

Continuous event streams 6

1.2 Exploring familiar event streams 7

Application-level logging 7

Web analytics 8

Publish/subscribe messaging 10

1.3 Unifying continuous event streams 12

The classic era 13

The hybrid era 16

The unified era 17

1.4 Introducing use cases for the unified log 19

Customer feedback loops 19

Holistic systems monitoring 21

Hot-swapping data application versions 22

2 The unified log 24

2.1 Understanding the anatomy of a unified log 25

Unified 25

Append-only 26

Distributed 27

Ordered 28

2.2 Introducing our application 29

Identifying our key events 30

Unified log, e-commerce style 31

Modeling our first event 32

2.3 Setting up our unified log 34

Downloading and installing Apache Kafka 34

Creating our stream 35

Sending and receiving events 36

3 Event stream processing with Apache Kafka 38

3.1 Event stream processing 101 39

Why process event streams? 39

Single-event processing 41

Multiple-event processing 42

3.2 Designing our first stream-processing app 42

Using Kafka as our company's glue 43

Locking down our requirements 44

3.3 Writing a simple Kafka worker 46

Setting up our development environment 46

Configuring our application 47

Reading from Kafka 49

Writing to Kafka 50

Stitching it all together 51

Testing 52

3.4 Writing a single-event processor 54

Writing our event processor 54

Updating our main function 56

Testing, redux 57

4 Event stream processing with Amazon Kinesis 60

4.1 Writing events to Kinesis 61

Systems monitoring and the unified log 61

Terminology differences from Kafka 63

Setting up our stream 64

Modeling our events 65

Writing our agent 66

4.2 Reading from Kinesis 72

Kinesis frameworks and SDKs 72

Reading events with the AWS CLI 73

Monitoring our stream with boto 79

5 Stateful stream processing 88

5.1 Detecting abandoned shopping carts 89

What management wants 89

Defining our algorithm 90

Introducing our derived events stream 91

5.2 Modeling our new events 92

Shopper adds item to cart 92

Shopper places order 93

Shopper abandons cart 93

5.3 Stateful stream processing 94

Introducing state management 94

Stream windowing 96

Stream processing frameworks and their capabilities 97

Stream processing frameworks 97

Choosing a stream processing framework for Nile 100

5.4 Detecting abandoned carts 101

Designing our Samza job 101

Preparing our project 102

Configuring our job 103

Writing our job's Java task 104

5.5 Running our Samza job 110

Introducing YARN 110

Submitting our job 111

Testing our job 112

Improving our job 113

Part 2 Data Engineering with Streams 115

6 Schemas 117

6.1 An introduction to schemas 118

Introducing Plum 118

Event schemas as contracts 120

Capabilities of schema technologies 121

Some schema technologies 123

Choosing a schema technology for Plum 125

6.2 Modeling our event in Avro 125

Setting up a development harness 126

Writing our health check event schema 127

From Avro to Java, and back again 129

Testing 131

6.3 Associating events with their schemas 132

Some modest proposals 132

A self-describing event for Plum 135

Plum's schema registry 137

7 Archiving events 140

7.1 The archivist's manifesto 141

Resilience 142

Reprocessing 143

Refinement 144

7.2 A design for archiving 146

What to archive 146

Where to archive 147

How to archive 148

7.3 Archiving Kafka with Secor 149

Warming up Kafka 150

Creating our event archive 152

Setting up Secor 153

7.4 Batch processing our archive 155

Batch processing 101 155

Designing our batch processing job 158

Writing our job in Apache Spark 159

Running our job on Elastic MapReduce 163

8 Railway-oriented processing 171

8.1 Leaving the happy path 172

Failure and Unix programs 172

Failure and Java 175

Failure and the log-industrial complex 178

8.2 Failure and the unified log 179

A design for failure 179

Modeling failures as events 181

Composing our happy path across jobs 183

8.3 Failure composition with Scalaz 184

Planning for failure 184

Setting up our Scala project 186

From Java to Scala 187

Better failure handling through Scalaz 189

Composing failures 191

8.4 Implementing railway-oriented processing 196

Introducing railway-oriented processing 196

Building the railway 199

9 Commands 208

9.1 Commands and the unified log 209

Events and commands 209

Implicit vs. explicit commands 210

Working with commands in a unified log 212

9.2 Making decisions 213

Introducing commands at Plum 213

Modeling commands 214

Writing our alert schema 216

Defining our alert schema 218

9.3 Consuming our commands 219

The right tool for the job 219

Reading our commands 220

Parsing our commands 221

Stitching it all together 224

Testing 224

9.4 Executing our commands 226

Signing up for Mailgun 226

Completing our executor 226

Final testing 230

9.5 Scaling up commands 231

One stream of commands, or many? 231

Handling command-execution failures 231

Command hierarchies 233

Part 3 Event Analytics 235

10 Analytics-on-read 237

10.1 Analytics-on-read, analytics-on-write 238

Analytics-on-read 238

Analytics-on-write 239

Choosing an approach 240

10.2 The OOPS event stream 242

Delivery truck events and entities 242

Delivery driver events and entities 243

The OOPS event model 243

The OOPS events archive 245

10.3 Getting started with Amazon Redshift 246

Introducing Redshift 246

Setting up Redshift 248

Designing an event warehouse 251

Creating our fat events table 255

10.4 ETL, ELT 256

Loading our events 256

Dimension widening 259

A detour on data volatility 263

10.5 Finally, some analysis 264

Analysis 1: Who does the most oil changes? 264

Analysis 2: Who is our most unreliable customer? 265

11 Analytics-on-write 268

11.1 Back to OOPS 269

Kinesis setup 269

Requirements gathering 271

Our analytics-on-write algorithm 272

11.2 Building our Lambda function 276

Setting up DynamoDB 276

Introduction to AWS Lambda 277

Lambda setup and event modeling 279

Revisiting our analytics-on-write algorithm 281

Conditional writes to DynamoDB 286

Finalizing our Lambda 289

11.3 Running our Lambda function 290

Deploying our Lambda function 290

Testing our Lambda function 293

Appendix AWS primer 297

Index 309
