NADIR: Differential Attention Flow for Non-Autoregressive Transliteration in Indic Languages

Authors

  • Lakshya Tomar, RocketFrog AI
  • Vinayak Abrol, Indraprastha Institute of Information Technology, Delhi
  • Puneet Agarwal, RocketFrog AI

DOI:

https://doi.org/10.1609/aaai.v40i31.39796

Abstract

In this work, we argue that not all sequence-to-sequence tasks require the strong inductive biases of autoregressive (AR) models. Tasks such as multilingual transliteration, code refactoring, grammatical correction, and text normalization often rely on local dependencies, so the full modeling capacity of AR models can be overkill: their high accuracy comes at the cost of high inference latency. Non-autoregressive (NAR) models offer speed, but they typically suffer from hallucinations and poor length control. To explore this trade-off, we focus on multilingual transliteration in Indic languages and introduce NADIR, a novel NAR architecture designed to balance speed and accuracy. NADIR integrates a Differential Transformer and a Mixture-of-Experts mechanism, enabling it to robustly model complex character mappings without sequential dependencies. NADIR achieves a speed-up of more than 13× over the state-of-the-art AR baseline. It maintains a competitive mean Character Error Rate of 15.78%, compared to 14.44% for the AR model and 21.88% for a standard NAR equivalent. Importantly, NADIR reduces repetition errors by 49.53%, substitution errors by 24.45%, omission errors by 32.92%, and insertion errors by 16.87%. This work provides a practical blueprint for building fast and reliable NAR systems, bridging the gap between AR accuracy and the demands of real-time, large-scale deployment.
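
The abstract reports results as a mean Character Error Rate (CER). As a rough illustration of how such a metric is commonly computed, the sketch below uses Levenshtein edit distance divided by reference length, averaged over pairs. This is an assumption about the usual definition, not the authors' evaluation code: the function names are hypothetical, the example strings are made up, and the sketch counts Unicode code points, whereas an evaluation on Indic scripts may instead count grapheme clusters or normalize the text first.

```python
from typing import Iterable, Tuple


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in code points."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # character of ref missing from hyp (omission)
                curr[j - 1] + 1,     # extra character in hyp (insertion)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character Error Rate of one hypothesis against one reference."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def mean_cer(pairs: Iterable[Tuple[str, str]]) -> float:
    """Unweighted mean of per-pair CER (illustrative averaging choice)."""
    scores = [cer(ref, hyp) for ref, hyp in pairs]
    if not scores:
        raise ValueError("at least one (reference, hypothesis) pair is required")
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Made-up romanized examples, not data from the paper.
    examples = [("namaste", "namste"), ("dilli", "dilli")]
    print(f"mean CER: {mean_cer(examples):.4f}")
```

Averaging per-pair CER weights every word equally; pooling all edits over all reference characters is another common choice and yields a different number, so the two should not be compared directly.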

Published

2026-03-14

How to Cite

Tomar, L., Abrol, V., & Agarwal, P. (2026). NADIR: Differential Attention Flow for Non-Autoregressive Transliteration in Indic Languages. Proceedings of the AAAI Conference on Artificial Intelligence, 40(31), 25958–25965. https://doi.org/10.1609/aaai.v40i31.39796

Section

AAAI Technical Track on Machine Learning VIII