Abstract
As deep neural networks (DNNs) continue to be deployed on resource-limited edge devices for interactive applications with low-latency requirements, there is a growing need to reduce inference time and energy consumption while maintaining acceptable prediction accuracy. In response, we introduce CAE-Net, a novel framework for designing and training lightweight, energy-efficient DNNs for image classification on edge devices. The proposed framework consists of two parts: (1) a new Enhanced Converting Autoencoder that employs entropy-based intraclass clustering to learn key image features by transforming hard images into easy, representative images, and (2) a composite lightweight CAE-Net classifier that combines the pre-trained encoder of the Enhanced Converting Autoencoder with a few classification layers from a baseline DNN, trained using knowledge transfer. Experimental results on the popular image-classification datasets MNIST and CIFAR-10 demonstrate that CAE-Net, unlike many state-of-the-art models, can satisfy inference latency targets of 10-20 ms on Raspberry Pi and 5-10 ms on Nvidia Jetson Nano. On CIFAR-10, compared with AlexNet, its pruned and distilled variants, and other competing DNNs evaluated against these service-level objective (SLO) targets, CAE-Net reduces energy consumption and inference latency by more than 4-fold on Raspberry Pi and by about 6-fold on Jetson Nano, while maintaining similar or higher accuracy.
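
To make the hard-to-easy transformation concrete, the following is a minimal illustrative sketch, not the procedure used in this work. It assumes that a baseline classifier supplies softmax probabilities for each training image, that "hardness" is measured by the Shannon entropy of those probabilities, and that each hard image is paired with the lowest-entropy image of its own class as a reconstruction target. The function names, the single-representative-per-class rule, and the threshold value are all hypothetical.

```python
import math

def prediction_entropy(probs):
    """Shannon entropy (in nats) of one vector of class probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def assign_easy_targets(all_probs, labels, threshold):
    """Return (targets, entropies).

    targets[i] is the index of the image that image i would be trained to
    reconstruct. Assumption: images whose entropy exceeds `threshold` are
    "hard" and are mapped to the lowest-entropy image of the same class;
    all other images are mapped to themselves.
    """
    entropies = [prediction_entropy(p) for p in all_probs]
    targets = list(range(len(labels)))
    for c in set(labels):
        members = [i for i, y in enumerate(labels) if y == c]
        # Hypothetical choice: one representative "easy" image per class.
        representative = min(members, key=lambda i: entropies[i])
        for i in members:
            if entropies[i] > threshold:
                targets[i] = representative
    return targets, entropies

# Toy example: four images, two classes, made-up probabilities.
demo_probs = [[0.98, 0.02], [0.55, 0.45], [0.10, 0.90], [0.50, 0.50]]
demo_labels = [0, 0, 1, 1]
demo_targets, demo_entropies = assign_easy_targets(demo_probs, demo_labels, threshold=0.5)
print(demo_targets)  # -> [0, 0, 2, 2]
```

In this toy case, images 1 and 3 have near-uniform predictions and are treated as hard, so each is paired with the most confidently classified image of its class. An autoencoder trained on such (input, target) pairs would learn to map hard inputs toward easy ones, which is the general idea the framework builds on.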