Summary
In Transfer Learning for Natural Language Processing you will learn:
Fine-tuning pretrained models with new domain data
Picking the right model to reduce resource usage
Transfer learning for neural network architectures
Generating text with generative pretrained transformers
Cross-lingual transfer learning with BERT
Foundations for exploring NLP academic literature
Training deep learning NLP models from scratch is costly, time-consuming, and requires massive amounts of data. In Transfer Learning for Natural Language Processing, DARPA researcher Paul Azunre reveals cutting-edge transfer learning techniques that apply customizable pretrained models to your own NLP architectures. You’ll learn how to use transfer learning to deliver state-of-the-art results for language comprehension, even when working with limited labeled data. Best of all, you’ll save on training time and computational costs.
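To make the fine-tuning idea concrete, here is a minimal sketch using the Hugging Face transformers and datasets libraries (the book's later chapters build on transformers). The checkpoint, dataset, and hyperparameters below are illustrative assumptions, not code from the book:

```python
# A minimal fine-tuning sketch, NOT the book's code: "bert-base-uncased",
# the IMDB dataset, and all hyperparameters are illustrative assumptions.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # pretrained weights, new 2-class head

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128,
                     padding="max_length")

# A small labeled subset stands in for "limited labeled data".
train = (load_dataset("imdb", split="train")
         .shuffle(seed=42).select(range(2000))
         .map(tokenize, batched=True))

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ft-out", num_train_epochs=1,
                           per_device_train_batch_size=16),
    train_dataset=train,
)
trainer.train()  # adapts the pretrained weights to the new task
```

The point of the sketch is the division of labor: nearly all of the model's language knowledge comes from pretraining, so a small labeled set and a short training run are enough to adapt it to a new task.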
Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.
About the technology
Build custom NLP models in record time, even with limited datasets! Transfer learning is a machine learning technique for adapting pretrained models to solve specialized problems. This powerful approach has revolutionized natural language processing, driving improvements in machine translation, business analytics, and natural language generation.
About the book
Transfer Learning for Natural Language Processing teaches you to create powerful NLP solutions quickly by building on existing pretrained models. This instantly useful book provides crystal-clear explanations of the concepts you need to grok transfer learning along with hands-on examples so you can practice your new skills immediately. As you go, you’ll apply state-of-the-art transfer learning methods to create a spam email classifier, a fact checker, and other real-world applications.
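As a flavor of the spam example, here is a simplified stand-in (the book itself builds its classifier on the Enron corpus with both shallow and deep transfer methods). The sentence-transformers checkpoint and the toy emails below are assumptions for illustration only:

```python
# A simplified stand-in for the book's spam example: pretrained sentence
# embeddings feed a logistic regression baseline. The checkpoint and the
# toy emails are illustrative assumptions, not the book's own code.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # pretrained text encoder

emails = [
    "WIN a FREE prize now, click this link!!!",
    "Meeting moved to 3 pm, agenda attached.",
    "Your account is suspended, verify your password immediately.",
    "Can we have lunch tomorrow?",
]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = ham

X = encoder.encode(emails)          # transfer step: reuse pretrained features
clf = LogisticRegression().fit(X, labels)

test = encoder.encode(["Claim your exclusive reward today!"])
print(clf.predict(test))            # likely [1], i.e. spam
```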
What's inside
Fine-tuning pretrained models with new domain data
Picking the right model to reduce resource use
Transfer learning for neural network architectures
Generating text with pretrained transformers (see the sketch after this list)
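A minimal sketch of that last item, using the transformers pipeline API that chapter 7 introduces; "gpt2" and the prompt are illustrative choices:

```python
# A minimal generation sketch using the transformers pipeline API;
# the checkpoint and prompt are assumptions for illustration.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
out = generator("Transfer learning lets NLP practitioners",
                max_new_tokens=30, num_return_sequences=1)
print(out[0]["generated_text"])
```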
About the reader
For machine learning engineers and data scientists with some experience in NLP.
About the author
Paul Azunre holds a PhD in Computer Science from MIT and has served as a Principal Investigator on several DARPA research programs.
Product Details
| ISBN-13 | 9781638350996 |
|---|---|
| Publisher | Manning |
| Publication date | 08/31/2021 |
| Sold by | SIMON & SCHUSTER |
| Format | eBook |
| Pages | 272 |
| File size | 6 MB |
Table of Contents
Preface xi
Acknowledgments xiii
About this book xv
About the author xix
About the cover illustration xx
Part 1 Introduction and Overview 1
1 What is transfer learning? 3
1.1 Overview of representative NLP tasks 5
1.2 Understanding NLP in the context of AI 7
Artificial intelligence (AI) 8
Machine learning 8
Natural language processing (NLP) 12
1.3 A brief history of NLP advances 14
General overview 14
Recent transfer learning advances 16
1.4 Transfer learning in computer vision 18
General overview 18
Pretrained ImageNet models 19
Fine-tuning pretrained ImageNet models 20
1.5 Why is NLP transfer learning an exciting topic to study now? 21
2 Getting started with baselines: Data preprocessing 24
2.1 Preprocessing email spam classification example data 27
Loading and visualizing the Enron corpus 28
Loading and visualizing the fraudulent email corpus 30
Converting the email text into numbers 34
2.2 Preprocessing movie sentiment classification example data 37
2.3 Generalized linear models 39
Logistic regression 40
Support vector machines (SVMs) 42
3 Getting started with baselines: Benchmarking and optimization 44
3.1 Decision-tree-based models 45
Random forests (RFs) 45
Gradient-boosting machines (GBMs) 46
3.2 Neural network models 50
Embeddings from Language Models (ELMo) 51
Bidirectional Encoder Representations from Transformers (BERT) 56
3.3 Optimizing performance 59
Manual hyperparameter tuning 60
Systematic hyperparameter tuning 61
Part 2 Shallow Transfer Learning and Deep Transfer Learning with Recurrent Neural Networks (RNNs) 65
4 Shallow transfer learning for NLP 67
4.1 Semisupervised learning with pretrained word embeddings 70
4.2 Semisupervised learning with higher-level representations 75
4.3 Multitask learning 76
Problem setup and a shallow neural single-task baseline 78
Dual-task experiment 80
4.4 Domain adaptation 81
5 Preprocessing data for recurrent neural network deep transfer learning experiments 86
5.1 Preprocessing tabular column-type classification data 89
Obtaining and visualizing tabular data 90
Preprocessing tabular data 93
Encoding preprocessed data as numbers 95
5.2 Preprocessing fact-checking example data 96
Special problem considerations 96
Loading and visualizing fact-checking data 97
6 Deep transfer learning for NLP with recurrent neural networks 99
6.1 Semantic Inference for the Modeling of Ontologies (SIMOn) 100
General neural architecture overview 101
Modeling tabular data 102
Application of SIMOn to tabular column-type classification data 102
6.2 Embeddings from Language Models (ELMo) 110
ELMo bidirectional language modeling 111
Application to fake news detection 112
6.3 Universal Language Model Fine-Tuning (ULMFiT) 114
Target task language model fine-tuning 115
Target task classifier fine-tuning 116
Part 3 Deep Transfer Learning with Transformers and Adaptation Strategies 119
7 Deep transfer learning for NLP with the transformer and GPT 121
7.1 The transformer 123
An introduction to the transformers library and attention visualization 126
Self-attention 128
Residual connections, encoder-decoder attention, and positional encoding 132
Application of pretrained encoder-decoder to translation 134
7.2 The Generative Pretrained Transformer 136
Architecture overview 137
Transformers pipelines introduction and application to text generation 140
Application to chatbots 141
8 Deep transfer learning for NLP with BERT and multilingual BERT 145
8.1 Bidirectional Encoder Representations from Transformers (BERT) 146
Model architecture 148
Application to question answering 151
Application to fill in the blanks and next-sentence prediction tasks 154
8.2 Cross-lingual learning with multilingual BERT (mBERT) 156
Brief JW300 dataset overview 157
Transfer mBERT to monolingual Twi data with the pretrained tokenizer 158
mBERT and tokenizer trained from scratch on monolingual Twi data 160
9 ULMFiT and knowledge distillation adaptation strategies 162
9.1 Gradual unfreezing and discriminative fine-tuning 163
Pretrained language model fine-tuning 165
Target task classifier fine-tuning 168
9.2 Knowledge distillation 170
Transfer DistilmBERT to monolingual Twi data with pretrained tokenizer 172
10 ALBERT, adapters, and multitask adaptation strategies 177
10.1 Embedding factorization and cross-layer parameter sharing 179
Fine-tuning pretrained ALBERT on MDSD book reviews 180
10.2 Multitask fine-tuning 183
General Language Understanding Evaluation (GLUE) 184
Fine-tuning on a single GLUE task 186
Sequential adaptation 188
10.3 Adapters 191
11 Conclusions 195
11.1 Overview of key concepts 196
11.2 Other emerging research trends 203
RoBERTa 203
GPT-3 203
XLNet 205
BigBird 206
Longformer 206
Reformer 206
T5 207
BART 208
XLM 209
TAPAS 209
11.3 Future of transfer learning in NLP 210
11.4 Ethical and environmental considerations 212
11.5 Staying up to date 214
Kaggle and Zindi competitions 214
arXiv 215
News and social media (Twitter) 215
11.6 Final words 216
Appendix A Kaggle primer 218
Appendix B Introduction to fundamental deep learning tools 228
Index 237