SinBasis Networks: Matrix-Equivalent Feature Extraction for Wave-Like Optical Spectrograms

Authors

  • Yuzhou Zhu, Dalian University of Technology
  • Zheng Zhang, Dalian University of Technology
  • Ruyi Zhang, Zhejiang Gongshang University
  • Liang Zhou, Dalian University of Technology

DOI:

https://doi.org/10.1609/aaai.v40i16.38412

Abstract

Wave-like images—from attosecond streaking spectrograms to optical spectra, audio mel-spectrograms and periodic video frames—encode critical harmonic structures that elude conventional feature extractors. We propose a unified, matrix-equivalent framework that reinterprets convolution and attention as linear transforms on flattened inputs, revealing filter weights as basis vectors spanning latent feature subspaces. To infuse spectral priors we apply elementwise sin(·) mappings to each weight matrix. Embedding these transforms into CNN, ViT and Capsule architectures yields Sin-Basis Networks with heightened sensitivity to periodic motifs and built-in invariance to spatial shifts. Experiments on a diverse collection of wave-like image datasets—including 80,000 synthetic attosecond streaking spectrograms, thousands of Raman, photoluminescence and FTIR spectra, mel-spectrograms from AudioSet and cycle-pattern frames from Kinetics—demonstrate substantial gains in reconstruction accuracy, translational robustness and zero-shot cross-domain transfer. Theoretical analysis via matrix isomorphism and Mercer-kernel truncation quantifies how sinusoidal reparametrization enriches expressivity while preserving stability in data-scarce regimes. Sin-Basis Networks thus offer a lightweight, physics-informed approach to deep learning across all wave-form imaging modalities.
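
The abstract's central idea, that a convolution can be read as a linear transform on a flattened input and that sin(·) can be applied elementwise to its weights, can be illustrated with a small NumPy sketch. This is one possible reading of the abstract and not the authors' implementation: the helper names `conv2d_valid` and `conv_as_matrix`, the 6×7 input, the 3×3 kernel, and the choice of an unpadded, stride-1 cross-correlation are all assumptions made for illustration.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Direct sliding-window cross-correlation (no padding, stride 1)."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def conv_as_matrix(kernel, image_shape):
    """Build M so that M @ image.ravel() equals conv2d_valid(image, kernel).ravel().

    Each row of M holds the kernel weights placed at the flattened
    positions of one receptive field; all other entries are zero.
    """
    H, W = image_shape
    kh, kw = kernel.shape
    oh, ow = H - kh + 1, W - kw + 1
    M = np.zeros((oh * ow, H * W))
    for i in range(oh):
        for j in range(ow):
            row = i * ow + j
            for a in range(kh):
                for b in range(kw):
                    M[row, (i + a) * W + (j + b)] = kernel[a, b]
    return M

# Hypothetical toy data (assumption: any small 2-D array works here).
rng = np.random.default_rng(0)
image = rng.standard_normal((6, 7))
kernel = rng.standard_normal((3, 3))

# Matrix-equivalent view: the convolution is a single matrix-vector product.
M = conv_as_matrix(kernel, image.shape)
matrix_out = M @ image.ravel()
direct_out = conv2d_valid(image, kernel).ravel()
matches_direct = np.allclose(matrix_out, direct_out)

# Elementwise sin(.) on the weight matrix. Because sin(0) = 0, the zero
# pattern of M is preserved, so this equals unrolling sin(kernel).
M_sin = np.sin(M)
sin_matches = np.allclose(M_sin, conv_as_matrix(np.sin(kernel), image.shape))
sin_out = M_sin @ image.ravel()
```

In this sketch the rows of `M` play the role of the basis vectors the abstract mentions. Since sin(0) = 0, applying the mapping to the unrolled matrix or to the kernel before unrolling gives the same operator, so the sparsity structure of the convolution is unchanged.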

Published

2026-03-14

How to Cite

Zhu, Y., Zhang, Z., Zhang, R., & Zhou, L. (2026). SinBasis Networks: Matrix-Equivalent Feature Extraction for Wave-Like Optical Spectrograms. Proceedings of the AAAI Conference on Artificial Intelligence, 40(16), 14014–14021. https://doi.org/10.1609/aaai.v40i16.38412

Issue

Vol. 40 No. 16 (2026)

Section

AAAI Technical Track on Computer Vision XIII