This talk will focus on improving the efficiency and effectiveness of deep self-supervised CNN models for image classification. Recent deep learning-based methods for image classification require a huge amount of training data to avoid overfitting. Moreover, supervised deep learning models need labeled datasets for training, and preparing labeled data at this scale takes considerable human effort and time. In this scenario, self-supervised models are becoming popular because of their ability to learn from unlabeled datasets. However, the efficient transfer of the knowledge learned by self-supervised models to a target task remains an open problem.
First, this talk will discuss a method for the efficient transfer of knowledge learned by a self-supervised model into a target task. Hyperparameters of the Fully Connected (FC) layers, such as the number of layers, the number of units in each layer, the learning rate, and the dropout rate, are automatically tuned using the Tree-structured Parzen Estimator (TPE), a Bayesian optimization technique.
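As a minimal sketch of what this tuning could look like, the snippet below uses Optuna's TPESampler, one common TPE implementation, to search over the FC-head hyperparameters; the search ranges, the 2048-dimensional feature input, and the train_and_validate helper are illustrative assumptions rather than the exact setup of the proposed method.

import optuna
import torch
import torch.nn as nn

def build_fc_head(trial, in_features, num_classes):
    # Hypothetical search space: number of FC layers, units per layer, and dropout.
    n_layers = trial.suggest_int("n_layers", 1, 3)
    dropout = trial.suggest_float("dropout", 0.1, 0.5)
    layers, width = [], in_features
    for i in range(n_layers):
        units = trial.suggest_int(f"units_{i}", 64, 1024, log=True)
        layers += [nn.Linear(width, units), nn.ReLU(), nn.Dropout(dropout)]
        width = units
    layers.append(nn.Linear(width, num_classes))
    return nn.Sequential(*layers)

def objective(trial):
    lr = trial.suggest_float("lr", 1e-5, 1e-2, log=True)
    head = build_fc_head(trial, in_features=2048, num_classes=10)
    optimizer = torch.optim.Adam(head.parameters(), lr=lr)
    # train_and_validate is a hypothetical placeholder for the usual training loop
    # over features extracted by the frozen self-supervised backbone; it returns
    # the validation accuracy that TPE maximizes.
    return train_and_validate(head, optimizer)

study = optuna.create_study(direction="maximize",
                            sampler=optuna.samplers.TPESampler())
study.optimize(objective, n_trials=50)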
In the second problem, we extend the concept of automatically tuning (autotuning) hyperparameters, proposed in the first problem, to the CNN layers. This work uses a pre-trained self-supervised model to transfer knowledge for image classification. We propose an efficient Bayesian optimization-based method for autotuning the hyperparameters of the self-supervised model during the knowledge transfer. The proposed autotuned image classifier consists of a few CNN layers followed by an FC layer. Finally, a softmax layer produces the class probability scores for the images.
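Following the same TPE pattern, the per-trial construction of such a classifier, with a few autotuned CNN layers, an FC layer, and a softmax, could look roughly as follows; the layer counts, filter numbers, and kernel sizes are illustrative assumptions, not the search space actually used.

import torch.nn as nn

def build_autotuned_classifier(trial, in_channels, num_classes):
    # Hypothetical CNN search space; the actual ranges in this work may differ.
    n_conv = trial.suggest_int("n_conv_layers", 1, 3)
    convs, c = [], in_channels
    for i in range(n_conv):
        out_c = trial.suggest_categorical(f"filters_{i}", [32, 64, 128])
        k = trial.suggest_categorical(f"kernel_{i}", [3, 5])
        convs += [nn.Conv2d(c, out_c, k, padding=k // 2), nn.ReLU(), nn.MaxPool2d(2)]
        c = out_c
    fc_units = trial.suggest_int("fc_units", 64, 512, log=True)
    return nn.Sequential(
        *convs,
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(c, fc_units), nn.ReLU(),
        nn.Linear(fc_units, num_classes),  # softmax is applied by the loss / at inference
    )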
In the third problem, in addition to the challenges mentioned in the second problem, we further focus on parameter overhead and prolonged GPU usage. This work proposes a method to address three challenges in an image classification task: the limited availability of labeled datasets for training, parameter overhead, and GPU usage for extended periods during training. We introduce an improved transfer learning approach for a target dataset, in which we take the learned features from a self-supervised model after reducing its parameter count by removing the final layer. The learned features are then fed into a CNN classifier followed by a multi-layer perceptron (MLP), where the hyperparameters of both the CNN and the MLP are automatically tuned (autotuned) using a Bayesian optimization-based technique. Further, we reduce the GFLOPs measure by limiting the hyperparameter search space, without compromising performance.
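As an illustrative sketch, assuming for concreteness a ResNet-50 backbone obtained from self-supervised pre-training, removing the final layer and freezing the backbone yields the learned features, and narrowing the hyperparameter ranges keeps the candidate CNN and MLP heads small, which in turn bounds their GFLOPs; the specific ranges below are assumptions, not the ones used in this work.

import torch
import torch.nn as nn
import torchvision

# Assumption: the self-supervised model is a ResNet-50 backbone (its weights would
# be loaded from the self-supervised checkpoint); replacing the final layer with
# Identity removes the classification/projection head and reduces the parameter count.
backbone = torchvision.models.resnet50()
backbone.fc = nn.Identity()
for p in backbone.parameters():
    p.requires_grad = False            # frozen: used only to extract learned features

with torch.no_grad():
    feats = backbone(torch.randn(8, 3, 224, 224))   # -> (8, 2048) learned features

# Illustrative narrowed ranges for the autotuned CNN + MLP head; a smaller search
# space keeps candidate models light and bounds the GFLOPs of the final classifier.
SEARCH_SPACE = {"conv_filters": [16, 32], "mlp_layers": (1, 2), "mlp_units": (32, 256)}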
Our ongoing work aims to propose an approach for transferring knowledge from a larger self-supervised model to a smaller model through a knowledge distillation task. The approach includes hyperparameter tuning of the MLP classifier in the student model, as well as of the balancing factor and the temperature, which play a crucial role in the knowledge distillation process. To enhance the effectiveness of the knowledge distillation, the ongoing work aims to introduce a loss function that is a linear combination of the hard-target loss, the soft-target loss, and the Barlow Twins loss. This approach is expected to improve the accuracy and efficiency of the knowledge transfer process.
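A plain PyTorch sketch of such a combined loss is given below; splitting the weighting into alpha and beta, the off-diagonal coefficient of 0.005, and the feature normalization follow the standard distillation and Barlow Twins formulations and are assumptions here, while alpha, beta, and the temperature correspond to the balancing factor and temperature that the ongoing work tunes.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      student_feats, teacher_feats,
                      alpha=0.5, beta=0.1, temperature=4.0):
    # Hard-target loss: cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    # Soft-target loss: KL divergence between temperature-softened distributions,
    # scaled by T^2 as in the standard distillation formulation.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2
    # Barlow Twins loss on the cross-correlation between batch-normalized student
    # and teacher features: drive the diagonal toward 1 and the off-diagonal toward 0.
    zs = (student_feats - student_feats.mean(0)) / (student_feats.std(0) + 1e-6)
    zt = (teacher_feats - teacher_feats.mean(0)) / (teacher_feats.std(0) + 1e-6)
    c = zs.T @ zt / zs.size(0)
    diag = torch.diagonal(c)
    bt = (diag - 1).pow(2).sum() + 0.005 * (c.pow(2).sum() - diag.pow(2).sum())
    # Linear combination: alpha balances hard vs. soft targets, beta weights the
    # Barlow Twins term; all three, plus the temperature, are candidates for autotuning.
    return (1 - alpha) * hard + alpha * soft + beta * bt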
In the future, we plan to prune CNN layers and filters of the self-supervised model that do not contribute to the training process.
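One standard way to realize such filter pruning would be L1-norm structured pruning via torch.nn.utils.prune, as sketched below; the magnitude criterion and the 30% ratio are illustrative assumptions, since the notion of a filter "not contributing" is still to be defined in this future work.

import torch.nn as nn
import torch.nn.utils.prune as prune

conv = nn.Conv2d(64, 128, kernel_size=3)   # stands in for a conv layer of the self-supervised model
# Remove the 30% of output filters with the smallest L1 norm; low filter magnitude
# is used here as a proxy for "not contributing", which is only one possible criterion.
prune.ln_structured(conv, name="weight", amount=0.3, n=1, dim=0)
prune.remove(conv, "weight")               # bake the pruning mask into the weights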