Modern Deep Learning Design and Applications

Versatile Tools to Solve Deep Learning Problems

Springer Link · Barnes & Noble · Amazon


Left: the book cover, featuring a mutant houseplant gone wild. Right: Little Mishi reading about meta-optimization while comfortably nestled in bed.


Table of contents
  1. Pitch and Introduction
  2. Chapters
  3. Case Study Papers
  4. Code

Pitch and Introduction

In the spring of 2020, I was approached by Celestin Suresh John, acquisitions editor for machine learning topics at Apress/Springer Nature, with an offer to write a book. I wrote up a book proposal in about two weeks, which was approved by the Apress editorial board. I spent the summer working several hours a day on the book - writing content, running and organizing code, producing visualizations, emailing paper authors, and responding to reviewers' revisions and comments. The result: a 7-chapter, 450-page book that - in my opinion, anyway - takes a novel, underrepresented perspective on modern deep learning developments.

Much has been written about deep learning, most of it adhering to what I call a code-centric framework. These resources, which span all sorts of media from websites to books to online courses, center concepts around code. The reason, I suspect, is largely that software is one of the most accessible tools, and computer science learners therefore often expect ‘hands-on’ experience without too much theory. However, students who develop an understanding of deep learning through code-centric frameworks tie that understanding restrictively to code. While this makes sense from a software engineering perspective (think: data structures, memory management, etc., which are inherently tied to implementation), deep learning is a combination of mathematics/statistics and computer science. There is an inherent abstraction to deep learning that code-centric frameworks often fail to explore fully, and this limits learners’ innovative and creative capacity to engineer novel deep learning solutions.

I think of this in terms of the bias/variance paradigm introduced early to machine learning students. Consider a simple curve-fitting model: if the model has high variance, it bases its understanding of the phenomenon almost completely on the known data points. The curve passes through every point, but it doesn’t model the regions “in-between” (i.e., “unseen” values not in the dataset) well. This is the sort of learning I believe too strict a code-centric framework encourages. In my book, I attempt to encourage generalization in learning by emphasizing intuitive theory as a guiding framework for engineering deep learning solutions and by demonstrating the versatility and freedom of deep learning implementation tools.
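To make the analogy concrete, here is a small illustrative sketch (my own, not taken from the book): a degree-9 polynomial fit through ten noisy samples of a sine wave passes through every training point exactly, yet a lower-degree fit typically tracks the underlying curve better at the unseen points in between.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)  # ten known data points
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.1, size=x.shape)

# High-variance fit: a degree-9 polynomial through 10 points
# interpolates every training point exactly (zero training error).
high_variance = np.polynomial.Polynomial.fit(x, y, deg=9)
# Lower-variance fit: a degree-3 polynomial smooths over the noise.
low_variance = np.polynomial.Polynomial.fit(x, y, deg=3)

# Evaluate both fits on "unseen" values between the training points.
x_between = np.linspace(0, 1, 200)
true_curve = np.sin(2 * np.pi * x_between)
print("degree-9 max error:", np.abs(high_variance(x_between) - true_curve).max())
print("degree-3 max error:", np.abs(low_variance(x_between) - true_curve).max())
```

The high-variance fit "memorizes" the dataset; the lower-variance fit generalizes - the same distinction I draw between code-memorized and concept-level understanding.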

Broadly, my book documents and organizes recent deep learning topics that have not yet made it into the bulk of “standard” deep learning literature. Topics include self-supervised learning, model compression (pruning, quantization, weight sharing, collaborative optimization), applications of Bayesian optimization to neural network design, Neural Architecture Search, and architecture design motifs.
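To give a flavor of the implementation side, here is a minimal sketch of one such topic - magnitude-based pruning - using the TensorFlow Model Optimization toolkit. The two-layer model and the sparsity schedule below are illustrative assumptions of mine, not code excerpted from the book.

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

# A hypothetical base model; any Keras model can be pruned this way.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Wrap the model so low-magnitude weights are progressively zeroed out
# during training, ramping from 0% to 80% sparsity over 1,000 steps.
pruned = tfmot.sparsity.keras.prune_low_magnitude(
    model,
    pruning_schedule=tfmot.sparsity.keras.PolynomialDecay(
        initial_sparsity=0.0, final_sparsity=0.8,
        begin_step=0, end_step=1000,
    ),
)
pruned.compile(optimizer="adam",
               loss="sparse_categorical_crossentropy",
               metrics=["accuracy"])

# The UpdatePruningStep callback advances the pruning schedule each batch:
# pruned.fit(x_train, y_train, epochs=5,
#            callbacks=[tfmot.sparsity.keras.UpdatePruningStep()])

# After training, strip the pruning wrappers to obtain the compact model.
final_model = tfmot.sparsity.keras.strip_pruning(pruned)
```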

Read the introduction to the book (pages I - XIX) here.


Chapters

Please email me (andreye@uw.edu) for book, chapter, or page-range requests.

The book is organized by the following outline of topics (deeper subsections not listed):

  • xvii Introduction
  • 001 Chapter 1: “A Deep Dive into Keras”
    • 002 Why Keras?
    • 003 Installing and Importing Keras
    • 004 The Simple Keras Workflow
    • 030 Visualizing Model Architectures
    • 033 Functional API
    • 041 Dealing with Data
    • 047 Key Points
  • 049 Chapter 2: “Pretraining Strategies and Transfer Learning”
    • 050 Developing Creative Training Structures
    • 065 Transfer Learning Practical Theory
    • 081 Implementing Transfer Learning
    • 091 Implementing Simple Self-Supervised Learning
    • 095 Case Studies
    • 112 Key Points
  • 115 Chapter 3: “The Versatility of Autoencoders”
    • 116 Autoencoder Intuition and Theory
    • 121 The Design of Autoencoder Implementation
    • 145 Autoencoder Applications (Denoising, Pretraining, VAE, etc.)
    • 188 Case Studies
    • 201 Key Points
  • 205 Chapter 4: “Model Compression for Practical Deployment”
    • 206 Introduction to Model Compression
    • 210 Pruning
    • 229 Quantization
    • 236 Weight Clustering
    • 240 Collaborative Optimization
    • 248 Case Studies
    • 257 Key Points
  • 259 Chapter 5: “Automating Model Design with Meta-Optimization”
    • 260 Introduction to Meta-Optimization
    • 264 General Hyperparameter Optimization
    • 289 Neural Architecture Search
    • 311 Case Studies
    • 323 Key Points
  • 327 Chapter 6: “Successful Neural Network Architecture Design”
    • 330 Nonlinear and Parallel Representation
    • 357 Block/Cell Design
    • 380 Neural Network Scaling
    • 399 Key Points
  • 401 Chapter 7: “Reframing Difficult Deep Learning Problems”
    • 403 Reframing Data Representation - DeepInsight
    • 414 Reframing Corrupted Data Usage - NLNL
    • 427 Reframing Limited Data Usage - Siamese Networks
    • 438 Key Points and Epilogue
  • 441 Index

Case Study Papers

In an effort to ground the book and to illuminate the wide breadth of concept applications, each chapter from the second onward features three case studies. Each case study centers on a paper relevant to the chapter’s discussion: it gives context, summarizes the paper’s contributions and concepts, presents reported results and diagrams, and offers code where applicable and feasible.

In total, the book explores 18 different papers. It was a great experience reaching out to the authors of each of these papers (even the ones who didn’t reply to my request - I’m looking at you, Vivek Ramanujan; it is a pity your fascinating paper was left out). I’ve organized and linked the discussed papers below for your reference and exploration.

Chapter 2 - “Pretraining Strategies and Transfer Learning”

Chapter 3 - “The Versatility of Autoencoders”

Chapter 4 - “Model Compression for Practical Deployment”

Chapter 5 - “Automating Model Design with Meta-Optimization”

Chapter 6 - “Successful Neural Network Architecture Design”

Chapter 7 - “Reframing Difficult Deep Learning Problems”


Code

The code snippets within each chapter have been arranged into notebooks for easy viewing and access. The raw notebooks for each chapter are available on the Apress GitHub.