Efficient On-Chip Learning for Optical Neural Networks Through Power-Aware Sparse Zeroth-Order Optimization

Authors

  • Jiaqi Gu University of Texas at Austin
  • Chenghao Feng University of Texas at Austin
  • Zheng Zhao Synopsys, Inc.
  • Zhoufeng Ying Alpine Optoelectronics, Inc.
  • Ray T. Chen University of Texas at Austin
  • David Z. Pan University of Texas at Austin

Keywords

Learning on the Edge & Model Compression, Optimization

Abstract

Optical neural networks (ONNs) have demonstrated record-breaking potential in high-performance neuromorphic computing due to their ultra-high execution speed and low energy consumption. However, current learning protocols fail to provide scalable and efficient solutions for photonic circuit optimization in practical applications. In this work, we propose a novel on-chip learning framework that unleashes the full potential of ONNs for power-efficient in situ training. Instead of deploying hard-to-implement back-propagation on chip, we directly optimize the device configurations under computation budgets and power constraints. We are the first to model ONN on-chip learning as a resource-constrained stochastic noisy zeroth-order optimization problem, and we propose a novel mixed-training strategy with two-level sparsity and power-aware dynamic pruning to offer a scalable on-chip training solution for practical ONN deployment. Compared with previous methods, we are the first to optimize over 2,500 optical components on chip. We achieve much better optimization stability, 3.7x-7.6x higher efficiency, and over 90% power savings under practical device variations and thermal crosstalk.
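To give a flavor of the abstract's core idea, the sketch below shows a generic sparse zeroth-order optimization loop with magnitude-based pruning on a noisy toy objective. This is not the paper's algorithm: the coordinate-sampling scheme, the quadratic objective, the pruning schedule, and all function names (`sparse_zo_step`, `power_aware_prune`) are illustrative assumptions standing in for the paper's two-level sparse sampling and power-aware dynamic pruning on real photonic devices.

```python
import numpy as np

def sparse_zo_step(f, theta, lr=0.05, sample_frac=0.2, eps=1e-3, rng=None):
    """One zeroth-order step (illustrative, not the paper's exact scheme):
    perturb a sparse random subset of parameters and estimate the
    directional derivatives by central finite differences of f."""
    rng = rng or np.random.default_rng()
    n = theta.size
    k = max(1, int(sample_frac * n))
    idx = rng.choice(n, size=k, replace=False)  # sparse coordinate sampling
    grad = np.zeros_like(theta)
    for i in idx:
        e = np.zeros_like(theta)
        e[i] = eps
        grad[i] = (f(theta + e) - f(theta - e)) / (2 * eps)
    return theta - lr * grad

def power_aware_prune(theta, budget_frac=0.25):
    """Zero out the smallest-magnitude parameters, a crude stand-in for
    pruning low-contribution devices to meet a power budget."""
    k = int(budget_frac * theta.size)
    if k == 0:
        return theta
    thresh = np.sort(np.abs(theta))[k - 1]
    pruned = theta.copy()
    pruned[np.abs(pruned) <= thresh] = 0.0
    return pruned

# Noisy toy objective standing in for an on-chip measured loss.
rng = np.random.default_rng(0)
target = rng.normal(size=32)
f = lambda th: np.sum((th - target) ** 2) + 1e-4 * rng.normal()

theta = np.zeros(32)
initial_loss = f(theta)
for step in range(300):
    theta = sparse_zo_step(f, theta, rng=rng)
    if step % 50 == 49:
        theta = power_aware_prune(theta)
final_loss = f(theta)
```

Because the objective is only queried, never differentiated, the same loop structure applies when `f` is a physical measurement of circuit loss rather than an analytic function, which is the setting that motivates zeroth-order training for ONNs.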

Published

2021-05-18

How to Cite

Gu, J., Feng, C., Zhao, Z., Ying, Z., Chen, R. T., & Pan, D. Z. (2021). Efficient On-Chip Learning for Optical Neural Networks Through Power-Aware Sparse Zeroth-Order Optimization. Proceedings of the AAAI Conference on Artificial Intelligence, 35(9), 7583-7591. Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/16928

Section

AAAI Technical Track on Machine Learning II