PESTO: Switching Point Based Dynamic and Relative Positional Encoding for Code-Mixed Languages (Student Abstract)

Mohsin Ali; Sai Teja Kandukuri; Sumanth Manduru; Parth Patwa; Amitava Das

doi:10.1609/aaai.v36i11.21587

Authors

Mohsin Ali IIIT Sri City, India
Sai Teja Kandukuri IIIT Sri City, India
Sumanth Manduru IIIT Sri City, India
Parth Patwa UCLA, USA
Amitava Das Wipro AI Labs, India; AI Institute, University of South Carolina, USA

DOI:

https://doi.org/10.1609/aaai.v36i11.21587

Keywords:

Word Embeddings, Code Mixing, Sentiment Analysis, Social Media

Abstract

NLP applications for code-mixed (CM), or mix-lingual, text have gained significant momentum recently, mainly because language mixing is prevalent in social media communication in multilingual societies such as India, Mexico, Europe, and parts of the USA. Word embeddings are basic building blocks of any NLP system today, yet word embeddings for CM languages remain unexplored territory. The major bottleneck for CM word embeddings is switching points, the positions where the language switches. These locations lack context, and statistical systems fail to model this phenomenon because of the high variance in the seen examples. In this paper we present our initial observations on applying switching-point-based positional encoding techniques to a CM language, specifically Hinglish (Hindi-English). The results are only marginally better than the state of the art (SOTA), but it is evident that positional encoding could be an effective way to train position-sensitive language models for CM text.
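
To make the notion of a switching point concrete, the following is a minimal Python sketch. It assumes each token already carries a language tag; the example sentence, the tags, and both function names are hypothetical and do not come from the paper. The second function shows one possible position index that restarts at every switching point. It is an illustrative assumption only, and not necessarily the encoding used in PESTO.

```python
from typing import List


def switching_points(lang_tags: List[str]) -> List[int]:
    """Return every index i where token i is in a different language than token i-1."""
    return [i for i in range(1, len(lang_tags)) if lang_tags[i] != lang_tags[i - 1]]


def switch_relative_positions(lang_tags: List[str]) -> List[int]:
    """Illustrative index (an assumption, not the paper's scheme):
    the distance of each token from the most recent switching point.
    The counter starts at 0 at the sentence start and resets to 0 at each switch.
    """
    positions = []
    current = 0
    for i, tag in enumerate(lang_tags):
        if i > 0 and tag != lang_tags[i - 1]:
            current = 0  # language changed: restart counting here
        positions.append(current)
        current += 1
    return positions


# Hypothetical Hinglish sentence with hand-assigned language tags.
tokens = ["mujhe", "yeh", "movie", "bahut", "pasand", "hai"]
tags = ["hi", "hi", "en", "hi", "hi", "hi"]

print(switching_points(tags))           # [2, 3]
print(switch_relative_positions(tags))  # [0, 1, 0, 0, 1, 2]
```

In this toy sentence the language changes going into "movie" and again going out of it, so a single embedded English word produces two switching points. An index of this kind could then be fed to any standard positional encoding in place of the absolute token position.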

Published

2022-06-28

How to Cite

Ali, M., Kandukuri, S. T., Manduru, S., Patwa, P., & Das, A. (2022). PESTO: Switching Point Based Dynamic and Relative Positional Encoding for Code-Mixed Languages (Student Abstract). Proceedings of the AAAI Conference on Artificial Intelligence, 36(11), 12901-12902. https://doi.org/10.1609/aaai.v36i11.21587

Issue

Vol. 36 No. 11: IAAI-22, EAAI-22, AAAI-22 Special Programs and Special Track, Student Papers and Demonstrations

Section

AAAI Student Abstract and Poster Program
