Layered-Parameter Perturbation for Zeroth-Order Optimization of Optical Neural Networks

Authors

  • Hiroshi Sawada, NTT Corporation
  • Kazuo Aoyama, NTT Corporation
  • Masaya Notomi, NTT Corporation

DOI:

https://doi.org/10.1609/aaai.v39i19.34235

Abstract

Optical neural networks (ONNs) have attracted great attention because of their low power consumption and high-speed processing. When training an ONN implemented on a chip with possible fabrication variations, the well-known backpropagation algorithm cannot be executed accurately because the exact internal state of the chip cannot be observed. Instead, a black-box optimization method such as zeroth-order (ZO) optimization is employed. In this paper, we first discuss how ONN parameters should be perturbed to search for better values in a black-box manner. Conventionally, parameter perturbations are sampled from a normal distribution with an identity covariance matrix. This is plausible if the parameters within a module are not interrelated, as in a linear module of an ordinary neural network. However, it is not the best choice for ONN modules with layered parameters, which are interrelated by optical paths. We therefore propose to sample the parameter perturbations from a normal distribution with a special covariance matrix computed by our novel method. The covariance matrix is designed so that the perturbations that the parameter perturbations cause at the module output become as isotropic as possible, which lets the search for better values proceed uniformly. Experimental results show that the proposed method using the special covariance matrix significantly outperformed conventional methods.
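
The contrast between the two sampling schemes can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' method: the Jacobian `J`, the quadratic `loss`, the helper names, and the choice `shaped_cov = inv(J.T @ J)` are all hypothetical and apply only to a toy linear module with a square, invertible Jacobian. The abstract does not say how the paper computes its covariance matrix.

```python
import numpy as np


def sample_perturbations(cov, num_samples, rng):
    """Draw rows u ~ N(0, cov) via a Cholesky factor (cov = chol @ chol.T)."""
    chol = np.linalg.cholesky(cov)
    z = rng.standard_normal((num_samples, cov.shape[0]))
    return z @ chol.T


def zo_gradient(loss, theta, cov, mu=1e-3, num_samples=20, rng=None):
    """Forward-difference ZO estimate; its mean is approximately cov @ grad(loss)."""
    rng = np.random.default_rng() if rng is None else rng
    base = loss(theta)
    estimate = np.zeros_like(theta)
    for u in sample_perturbations(cov, num_samples, rng):
        # Only loss values are queried, never internal gradients (black box).
        estimate += (loss(theta + mu * u) - base) / mu * u
    return estimate / num_samples


def output_anisotropy(jacobian, cov):
    """Condition number of the output-perturbation covariance J cov J^T.

    A value of 1.0 means the perturbations seen at the output are isotropic.
    """
    return np.linalg.cond(jacobian @ cov @ jacobian.T)


# ASSUMPTION: a toy Jacobian of the module output with respect to its
# parameters. The first parameter influences every output, loosely mimicking
# parameters that are interrelated through shared paths.
J = np.array([[2.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.5, 0.5, 0.5]])
target = np.array([1.0, -1.0, 0.5])


def loss(theta):
    """Squared error of the (linear) toy module output against a target."""
    return float(np.sum((J @ theta - target) ** 2))


identity_cov = np.eye(3)                      # conventional choice
shaped_cov = np.linalg.inv(J.T @ J)           # ASSUMPTION: makes J cov J^T = I
shaped_cov = (shaped_cov + shaped_cov.T) / 2  # enforce exact symmetry

aniso_identity = output_anisotropy(J, identity_cov)
aniso_shaped = output_anisotropy(J, shaped_cov)

# Plain ZO descent using the shaped covariance.
rng = np.random.default_rng(0)
theta = np.zeros(3)
initial_loss = loss(theta)
for _ in range(100):
    theta = theta - 0.1 * zo_gradient(loss, theta, shaped_cov, rng=rng)
final_loss = loss(theta)

print("output anisotropy, identity covariance:", aniso_identity)
print("output anisotropy, shaped covariance:  ", aniso_shaped)
print("loss before and after ZO descent:", initial_loss, final_loss)
```

In this toy setting the identity covariance yields an output-perturbation covariance `J @ J.T` that is far from isotropic, while the shaped covariance makes it the identity matrix. A real ONN module is nonlinear and its internal Jacobian is not directly observable, so the covariance has to be obtained by other means.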

Published

2025-04-11

How to Cite

Sawada, H., Aoyama, K., & Notomi, M. (2025). Layered-Parameter Perturbation for Zeroth-Order Optimization of Optical Neural Networks. Proceedings of the AAAI Conference on Artificial Intelligence, 39(19), 20292–20301. https://doi.org/10.1609/aaai.v39i19.34235

Issue

Vol. 39 No. 19 (2025)

Section

AAAI Technical Track on Machine Learning V