Large Scale Multilingual Sticker Recommendation In Messaging Apps
Stickers are widely used in messaging to visually express nuanced thoughts. We describe a real-time sticker recommendation (SR) system. We decompose SR into two steps: predict the message that is likely to be sent, and substitute that message with an appropriate sticker. To address the challenges caused by transliteration of messages from users’ native languages to the Roman script, we learn message embeddings in an unsupervised manner using a character-level CNN. We use these embeddings to cluster semantically similar messages, and then predict the message cluster instead of the message itself. Except for validation, our system does not require human-labeled data, leading to a fully automatic tuning pipeline. We propose a hybrid message prediction model that can easily run on low-end phones. We discuss the mapping from message clusters to stickers, how we address the multilingual needs of our users, and automated tuning of the system, and we propose a novel application of a community detection algorithm. As of November 2020, our system contains 100k+ stickers, has been deployed for 15+ months, and is being used by millions of users.
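The two-step decomposition can be illustrated with a minimal sketch. This is not the deployed system: the cluster contents, the sticker identifiers, and the function names `predict_clusters` and `recommend_stickers` are hypothetical, and simple prefix matching stands in for the hybrid message prediction model and the learned character-level CNN embeddings.

```python
from collections import Counter

# Hypothetical toy data: each cluster groups transliterated or spelling
# variants of messages that share one meaning.
CLUSTERS = {
    "greeting_morning": ["good morning", "gud mrng", "suprabhat"],
    "ask_wellbeing": ["kaise ho", "kese ho", "how are you"],
}

# Hypothetical mapping from message cluster to stickers.
CLUSTER_TO_STICKERS = {
    "greeting_morning": ["sticker_sunrise", "sticker_coffee"],
    "ask_wellbeing": ["sticker_wave"],
}


def predict_clusters(typed_text, top_k=2):
    """Step 1: predict likely message clusters from the text typed so far.

    Assumption: a cluster's score is the number of its messages that start
    with the typed prefix. The paper's model is a stand-in target here.
    """
    prefix = typed_text.strip().lower()
    if not prefix:
        return []
    scores = Counter()
    for cluster_id, messages in CLUSTERS.items():
        hits = sum(1 for message in messages if message.startswith(prefix))
        if hits:
            scores[cluster_id] = hits
    return [cluster_id for cluster_id, _ in scores.most_common(top_k)]


def recommend_stickers(typed_text, top_k=2):
    """Step 2: substitute each predicted cluster with its mapped stickers."""
    stickers = []
    for cluster_id in predict_clusters(typed_text, top_k):
        stickers.extend(CLUSTER_TO_STICKERS.get(cluster_id, []))
    return stickers
```

Predicting a cluster rather than an exact message means that spelling variants such as "kaise ho" and "kese ho" lead to the same stickers, which is the motivation the abstract gives for clustering transliterated text.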