Abstract: Artificial Intelligence (AI) based applications are increasingly being deployed at the edge of the network because many of these applications require fast response times [1]. At the same time, on-device processing becomes important because real-time edge datasets may not be amenable to storage in the cloud. However, this trend leads to a larger number of processors distributed more broadly across the edge of the network. Power constraints on these edge processors or AI accelerators present a challenge, since these devices will usually need to operate on battery power. The growing deployment of AI accelerator networks may also increase the carbon footprint of AI systems.
AI applications typically consume more than 50% of the power in infrastructure devices at the edge of the network. Within these devices, the AI accelerator that performs inference consumes a large share of the power (typically over 50%). Consequently, reducing both the active and idle power of AI accelerators will significantly improve the energy efficiency of infrastructure devices and of overall AI systems. Several techniques can lower active power, including quantization, pruning, RMSE-loss minimization, sparsity, and hybrid architectures that combine different types of AI accelerators.
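As an illustrative sketch (not the specific method evaluated in this work), quantization reduces active power by replacing 32-bit floating-point weights with low-bit integers, shrinking both memory traffic and arithmetic cost. The following minimal NumPy example shows symmetric 8-bit post-training weight quantization; the function names and the 256×256 weight matrix are hypothetical, chosen only for demonstration.

```python
import numpy as np

def quantize_int8(w):
    # Symmetric uniform quantization: map float32 weights to int8 so that
    # the largest-magnitude weight lands exactly on +/-127.
    scale = max(np.max(np.abs(w)), 1e-8) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover approximate float32 weights for accuracy checks.
    return q.astype(np.float32) * scale

# Hypothetical weight matrix standing in for one layer of a model.
w = np.random.randn(256, 256).astype(np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
err = np.max(np.abs(w - w_hat))  # worst-case rounding error, <= scale / 2
```

The int8 representation occupies one quarter of the float32 storage, which directly cuts memory-access energy; the per-tensor scale keeps the worst-case rounding error bounded by half a quantization step.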
A significant portion of the power consumption of AI accelerators is idle power caused by the scheduling and repeated wake-up of the accelerators to perform inference tasks [2]. Redundant carry-free designs and asynchronous designs can be used to further lower active power. After compilation, sparse-structured weights can be used to reduce memory accesses, a primary source of power consumption and delay in AI inference. To better analyze the power savings achievable through sleep operation, a power profile of typical workloads should be explored.
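To make the sparse-weight point concrete, the sketch below (an illustrative NumPy example, not the compiler flow described above) stores a pruned weight matrix in CSR-style arrays so that inference performs one multiply-accumulate, and one memory fetch, per stored nonzero weight rather than per matrix entry.

```python
import numpy as np

def to_csr(w, threshold=0.0):
    # Compress a mostly-zero (pruned) weight matrix into CSR-style arrays:
    # only nonzero values and their column indices are stored and fetched.
    rows, cols = np.nonzero(np.abs(w) > threshold)
    vals = w[rows, cols]
    # indptr[i] marks where row i's nonzeros begin in vals/cols.
    indptr = np.searchsorted(rows, np.arange(w.shape[0] + 1))
    return vals, cols, indptr

def spmv(vals, cols, indptr, x):
    # Sparse matrix-vector product: memory accesses scale with the number
    # of nonzeros, not with the dense matrix size.
    y = np.zeros(len(indptr) - 1, dtype=x.dtype)
    for i in range(len(y)):
        s, e = indptr[i], indptr[i + 1]
        y[i] = vals[s:e] @ x[cols[s:e]]
    return y

# Hypothetical pruned layer: zero out ~87% of a random weight matrix.
w = np.random.randn(64, 64)
w[np.abs(w) < 1.5] = 0.0
x = np.random.randn(64)
vals, cols, indptr = to_csr(w)
y = spmv(vals, cols, indptr, x)
```

On a real accelerator the same principle applies in hardware: the compressed format lets the datapath skip zero weights entirely, which is where the memory-access power and delay savings come from.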
Keywords: Power Efficiency, Semiconductors, Edge AI, Low Power Design, Scalable Intelligence, Wireless Systems, Embedded AI, Edge Computing, AI Accelerators, Energy Efficiency, Real-Time Processing, IoT Devices, Neural Network Hardware, Smart Sensors, Signal Processing, AI Chips, 5G Integration, TinyML, Hardware Optimization, System-on-Chip (SoC)