Introduction to Machine Learning with R: Rigorous Mathematical Analysis

by Scott Burger

Paperback

$55.99 

Overview

Machine learning is an intimidating subject until you know the fundamentals. If you understand basic coding concepts, this introductory guide will help you gain a solid foundation in machine learning principles. Using the R programming language, you’ll start with regression modeling and then move on to more advanced topics such as neural networks and tree-based methods.

Finally, you’ll delve into the frontier of machine learning, using the caret package in R. Once you develop a familiarity with topics such as the difference between regression and classification models, you’ll be able to solve an array of machine learning problems. Author Scott V. Burger provides several examples to help you build a working knowledge of machine learning.

  • Explore machine learning models, algorithms, and data training
  • Understand machine learning algorithms for supervised and unsupervised cases
  • Examine statistical concepts for designing data for use in models
  • Dive into linear regression models used in business and science
  • Use single-layer and multilayer neural networks for calculating outcomes
  • Look at how tree-based models work, including popular decision trees
  • Get a comprehensive view of the machine learning ecosystem in R
  • Explore the powerhouse of tools available in R’s caret package

Product Details

ISBN-13: 9781491976449
Publisher: O'Reilly Media, Incorporated
Publication date: 04/02/2018
Pages: 223
Product dimensions: 7.00(w) x 9.10(h) x 0.60(d)

About the Author

Scott Burger is a senior data scientist living and working in Seattle. His programming experience comes from the realm of astrophysics, but he uses it in many different types of scenarios ranging from business intelligence to database optimizations. Scott has built a solid career on explaining terse scientific concepts to the general public and wants to use that expertise to shed light on the world of machine learning for the general R user.

Table of Contents

Preface vii

1 What Is a Model? 1

Algorithms Versus Models: What's the Difference? 6

A Note on Terminology 7

Modeling Limitations 8

Statistics and Computation in Modeling 10

Data Training 11

Cross-Validation 12

Why Use R? 13

The Good 13

R and Machine Learning 15

The Bad 16

Summary 17

2 Supervised and Unsupervised Machine Learning 19

Supervised Models 20

Regression 20

Training and Testing of Data 22

Classification 24

Logistic Regression 24

Supervised Clustering Methods 26

Mixed Methods 31

Tree-Based Models 31

Random Forests 34

Neural Networks 35

Support Vector Machines 39

Unsupervised Learning 40

Unsupervised Clustering Methods 41

Summary 43

3 Sampling Statistics and Model Training in R 45

Bias 46

Sampling in R 51

Training and Testing 54

Roles of Training and Test Sets 55

Why Make a Test Set? 55

Training and Test Sets: Regression Modeling 55

Training and Test Sets: Classification Modeling 63

Cross-Validation 67

k-Fold Cross-Validation 67

Summary 69

4 Regression in a Nutshell 71

Linear Regression 72

Multivariate Regression 74

Regularization 78

Polynomial Regression 81

Goodness of Fit with Data: The Perils of Overfitting 87

Root-Mean-Square Error 87

Model Simplicity and Goodness of Fit 89

Logistic Regression 91

The Motivation for Classification 92

The Decision Boundary 93

The Sigmoid Function 94

Binary Classification 98

Multiclass Classification 101

Logistic Regression with caret 105

Summary 106

Linear Regression 106

Logistic Regression 107

5 Neural Networks in a Nutshell 109

Single-Layer Neural Networks 109

Building a Simple Neural Network by Using R 111

Multiple Compute Outputs 113

Hidden Compute Nodes 114

Multilayer Neural Networks 120

Neural Networks for Regression 125

Neural Networks for Classification 130

Neural Networks with caret 131

Regression 131

Classification 132

Summary 133

6 Tree-Based Methods 135

A Simple Tree Model 135

Deciding How to Split Trees 138

Tree Entropy and Information Gain 139

Pros and Cons of Decision Trees 140

Tree Overfitting 141

Pruning Trees 145

Decision Trees for Regression 151

Decision Trees for Classification 151

Conditional Inference Trees 152

Conditional Inference Tree Regression 154

Conditional Inference Tree Classification 155

Random Forests 155

Random Forest Regression 156

Random Forest Classification 157

Summary 158

7 Other Advanced Methods 159

Naive Bayes Classification 159

Bayesian Statistics in a Nutshell 159

Application of Naive Bayes 161

Principal Component Analysis 163

Linear Discriminant Analysis 169

Support Vector Machines 175

k-Nearest Neighbors 179

Regression Using kNN 181

Classification Using kNN 182

Summary 184

8 Machine Learning with the caret Package 185

The Titanic Dataset 186

Data Wrangling 187

Caret Unleashed 188

Imputation 188

Data Splitting 190

Caret Under the Hood 191

Model Training 194

Comparing Multiple caret Models 197

Summary 199

A Encyclopedia of Machine Learning Models in caret 201

Index 209
