Location: Food Quality Laboratory
Title: AdderNet 2.0: Optimal AdderNet accelerator designs with activation-oriented quantization and fused bias removal-based memory optimization
Authors:
Zhang, Yunxiang - Binghamton University
Kailani, Omar Al - Binghamton University
Zhou, Bin
Zhao, Wenfeng - Binghamton University
Submitted to: IEEE Transactions on Circuits and Systems I
Publication Type: Peer Reviewed Journal Publication
Acceptance Date: 2/3/2025
Publication Date: 2/13/2025
Citation: Zhang, Y., Kailani, O., Zhou, B., Zhao, W. 2025. AdderNet 2.0: Optimal AdderNet accelerator designs with activation-oriented quantization and fused bias removal-based memory optimization. IEEE Transactions on Circuits and Systems I. https://doi.org/10.1109/TCSI.2025.3539912.
DOI: https://doi.org/10.1109/TCSI.2025.3539912

Interpretive Summary: The increasing demand for AI-driven technologies to evaluate and analyze produce quality within supply chains is currently hampered by the substantial energy consumption and hardware demands of traditional neural network models. USDA ARS scientists, in collaboration with their partners, have developed an algorithm-hardware co-design framework that significantly improves overall energy efficiency and resource utilization. This advance paves the way for deploying larger and more sophisticated neural network models on resource-constrained devices, thereby facilitating the creation of energy-efficient and cost-effective AI-powered produce quality monitoring platforms. This technology has the potential to reduce food waste, improve supply chain efficiency, and promote sustainable agricultural practices.

Technical Abstract: Convolutional neural networks (CNNs) are computationally demanding due to expensive multiply-accumulate (MAC) operations. Emerging neural network models, such as AdderNet, exploit efficient arithmetic alternatives like sum-of-absolute-difference (SAD) operations to replace the costly MAC operations, while still achieving model accuracy competitive with their CNN counterparts. Nevertheless, existing AdderNet accelerators face critical implementation challenges: they achieve maximal hardware and energy efficiency only at the cost of model inference accuracy loss.
This paper presents AdderNet 2.0, an algorithm-hardware co-design framework featuring a novel Activation-Oriented Quantization (AOQ) strategy, a Fused Bias Removal (FBR) scheme for on-chip feature map memory bitwidth reduction, and optimized PE designs that improve overall resource utilization toward optimal AdderNet accelerator designs. Multiple AdderNet 2.0 accelerator design variants were implemented on the Xilinx KV-260 FPGA. Experimental results show that the INT6 AdderNet 2.0 accelerators achieve significant hardware resource and energy savings compared with prior CNN and AdderNet designs: up to 3.78× DSP density improvement at nearly the same throughput as the INT8 CNN design, making it possible to deploy large network models on resource-constrained devices. Furthermore, AdderNet 2.0 achieves 19.8% LUT, 28.64% FF, and 5% BRAM savings compared to the baseline INT8 CNN design.
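To make the core arithmetic substitution concrete, the sketch below contrasts a standard MAC-based convolution response with AdderNet's SAD-based response for a single filter position. This is a minimal illustration of the general AdderNet idea, not of the paper's accelerator or quantization schemes; the function names and toy values are this sketch's own.

```python
import numpy as np

def conv_mac(patch, kernel):
    """Standard CNN response: multiply-accumulate (MAC) over the window."""
    return float(np.sum(patch * kernel))

def conv_sad(patch, kernel):
    """AdderNet-style response: negated sum of absolute differences (SAD).
    An L1 similarity measure that needs only additions and subtractions,
    avoiding hardware-expensive multipliers."""
    return float(-np.sum(np.abs(patch - kernel)))

# Toy 2x2 input window and filter.
patch = np.array([[1.0, 2.0],
                  [3.0, 4.0]])
kernel = np.array([[1.0, 0.0],
                   [0.0, 1.0]])

print(conv_mac(patch, kernel))  # 5.0  (1*1 + 2*0 + 3*0 + 4*1)
print(conv_sad(patch, kernel))  # -8.0 (-(0 + 2 + 3 + 3))
```

In both cases a larger response indicates a better match between input and filter, but the SAD form maps onto adder/subtractor logic, which is what allows the accelerator designs above to trade DSP-heavy multipliers for cheaper fabric resources.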
