Modularizing Deep Learning via Pairwise Learning With Kernels

IEEE Transactions on Neural Networks and Learning Systems, 2021

Shiyu Duan, Shujian Yu, Jose C. Principe [pdf][code]

TL;DR: Using a simple trick, we reveal the kernel machines hidden inside your favorite neural networks. Based on this observation, we propose a provably optimal modular training framework for neural networks in classification, making fully modular deep learning workflows possible. Our training method needs no between-module backpropagation and relies almost entirely on weak pairwise labels, yet it matches end-to-end backpropagation in accuracy. Finally, we demonstrate that a modular workflow naturally provides simple but reliable solutions to long-standing problems in important domains such as transfer learning.

 

Abstract: By redefining the conventional notions of layers, we present an alternative view on finitely wide, fully trainable deep neural networks as stacked linear models in feature spaces, leading to a kernel machine interpretation. Based on this construction, we then propose a provably optimal modular learning framework for classification that does not require between-module backpropagation. This modular approach brings new insights into the label requirement of deep learning: it leverages only implicit pairwise labels (weak supervision) when learning the hidden modules. When training the output module, on the other hand, it requires full supervision but achieves high label efficiency, needing as few as 10 randomly selected labeled examples (one from each class) to achieve 94.88% accuracy on CIFAR-10 with a ResNet-18 backbone. Moreover, modular training enables fully modularized deep learning workflows, which simplify the design and implementation of pipelines and improve the maintainability and reusability of models. To showcase the advantages of such a modularized workflow, we describe a simple yet reliable method for estimating the reusability of pre-trained modules as well as task transferability in a transfer learning setting. At practically no computational overhead, it precisely describes the task-space structure of 15 binary classification tasks from CIFAR-10.
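
To make the modular recipe above concrete, here is a minimal, hypothetical sketch written in a PyTorch style. The toy two-module network, the Gaussian kernel, the alignment-style pairwise loss, and the synthetic data are illustrative assumptions, not the paper's exact architecture or objective; the sketch only shows the structure of the workflow: hidden modules are trained one at a time from weak "same class or not" pairwise labels with no gradient crossing module boundaries, after which a small output module is fit with a handful of fully labeled examples.

# Hypothetical sketch of modular training with pairwise labels (not the paper's exact objective).
import torch
import torch.nn as nn

def gaussian_kernel(a, b, sigma=1.0):
    # k(a, b) = exp(-||a - b||^2 / (2 sigma^2)), computed per pair (row-wise).
    return torch.exp(-((a - b) ** 2).sum(dim=1) / (2 * sigma ** 2))

def pairwise_loss(z_i, z_j, same_class):
    # Push the kernel value toward 1 for same-class pairs and toward 0 for
    # different-class pairs; only the weak pairwise label is used here.
    return nn.functional.mse_loss(gaussian_kernel(z_i, z_j), same_class.float())

# Toy data: two Gaussian blobs, plus random pairs with "same class?" labels.
x = torch.cat([torch.randn(200, 8) + 2.0, torch.randn(200, 8) - 2.0])
y = torch.cat([torch.zeros(200), torch.ones(200)]).long()
idx_i, idx_j = torch.randint(0, 400, (1024,)), torch.randint(0, 400, (1024,))
same = y[idx_i] == y[idx_j]

# Hidden modules are trained greedily; no gradient crosses module boundaries.
hidden_modules = [nn.Sequential(nn.Linear(8, 16), nn.Tanh()),
                  nn.Sequential(nn.Linear(16, 16), nn.Tanh())]
frozen = nn.Identity()
for module in hidden_modules:
    opt = torch.optim.Adam(module.parameters(), lr=1e-2)
    for _ in range(200):
        with torch.no_grad():                      # inputs come from frozen predecessors
            z_i, z_j = frozen(x[idx_i]), frozen(x[idx_j])
        loss = pairwise_loss(module(z_i), module(z_j), same)
        opt.zero_grad(); loss.backward(); opt.step()
    frozen = nn.Sequential(frozen, module).eval()  # freeze and stack

# Output module: an ordinary classifier trained with (few) fully labeled examples.
head = nn.Linear(16, 2)
opt = torch.optim.Adam(head.parameters(), lr=1e-2)
labeled = torch.tensor([0, 200])                   # one labeled example per class
for _ in range(200):
    with torch.no_grad():
        z = frozen(x[labeled])
    loss = nn.functional.cross_entropy(head(z), y[labeled])
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():
    acc = (head(frozen(x)).argmax(dim=1) == y).float().mean()
print(f"toy accuracy: {acc.item():.2f}")

Because each module is trained and frozen in isolation, any trained module can be swapped out, reused, or fine-tuned independently, which is the property the modular workflow builds on.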

 

BibTeX

@article{duan2021modularizing,
  title={Modularizing deep learning via pairwise learning with kernels},
  author={Duan, Shiyu and Yu, Shujian and Pr{\'\i}ncipe, Jos{\'e} C},
  journal={IEEE Transactions on Neural Networks and Learning Systems},
  volume={33},
  number={4},
  pages={1441--1451},
  year={2021},
  publisher={IEEE}
}