Layered-Parameter Perturbation for Zeroth-Order Optimization of Optical Neural Networks

Hiroshi Sawada; Kazuo Aoyama; Masaya Notomi

doi:10.1609/aaai.v39i19.34235

Authors

Hiroshi Sawada, NTT Corporation
Kazuo Aoyama, NTT Corporation
Masaya Notomi, NTT Corporation


Abstract

Optical neural networks (ONNs) have attracted great attention due to their low power consumption and high-speed processing. When training an ONN implemented on a chip with possible fabrication variations, the well-known backpropagation algorithm cannot be executed accurately because the perfect information inside the chip cannot be observed. Instead, we employ a black-box optimization method such as zeroth-order (ZO) optimization. In this paper, we first discuss how ONN parameters should be perturbed to search for better values in a black-box manner. Conventionally, parameter perturbations are sampled from a normal distribution with an identity covariance matrix. This is plausible if the parameters are not interrelated in a module, like a linear module of an ordinary neural network. However, this is not the best way for ONN modules with layered parameters, which are interrelated by optical paths. We then propose to perturb the parameters by a normal distribution with a special covariance matrix computed by our novel method. The covariance matrix is designed so that the perturbations appearing at the module output caused by the parameter perturbations become as isotropic as possible to uniformly search for better values. Experimental results show that the proposed method using the special covariance matrix significantly outperformed conventional methods.
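
The idea of shaping the perturbation covariance so that output perturbations become isotropic can be illustrated with a minimal NumPy sketch. This is not the authors' method for computing the covariance matrix, which the abstract does not describe. It assumes a toy module whose output-to-parameter Jacobian `J` is known and has full row rank, and uses the pseudo-inverse of `J` as a covariance factor; `isotropic_cov_factor`, `zo_gradient`, and the random `J` are hypothetical names and data introduced only for illustration.

```python
import numpy as np


def isotropic_cov_factor(jacobian):
    """Return a factor C so that u = C z with z ~ N(0, I) has covariance C C^T.

    Assumption (not from the paper): with C = pinv(J) and J of full row rank,
    the induced output covariance J C C^T J^T equals the identity.
    """
    return np.linalg.pinv(jacobian)


def zo_gradient(loss, theta, cov_factor, num_samples=20, mu=1e-3, rng=None):
    """Two-point zeroth-order estimate using perturbations u ~ N(0, C C^T).

    In expectation this estimates (C C^T) times the true gradient, that is,
    a covariance-preconditioned gradient, using loss evaluations only.
    """
    rng = np.random.default_rng() if rng is None else rng
    grad = np.zeros_like(theta)
    for _ in range(num_samples):
        u = cov_factor @ rng.standard_normal(cov_factor.shape[1])
        # Central difference of the loss along direction u.
        slope = (loss(theta + mu * u) - loss(theta - mu * u)) / (2.0 * mu)
        grad += slope * u
    return grad / num_samples


rng = np.random.default_rng(0)
J = rng.standard_normal((3, 6))  # hypothetical Jacobian: 3 outputs, 6 parameters
C = isotropic_cov_factor(J)

# Output covariance when parameters are perturbed with identity covariance.
out_cov_identity = J @ J.T
# Output covariance when parameters are perturbed with covariance C C^T.
out_cov_shaped = J @ C @ C.T @ J.T
```

In this toy setting `out_cov_shaped` is the identity matrix up to rounding, while `out_cov_identity` generally is not, which is the contrast the abstract draws between conventional and layered-parameter perturbation. On a real chip the Jacobian is not directly observable, so how the covariance is actually obtained is left to the paper itself.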
