TRUST: Leveraging Text Robustness for Unsupervised Domain Adaptation

Mattia Litrico; Mario Valerio Giuffrida; Sebastiano Battiato; Devis Tuia

doi:10.1609/aaai.v40i28.39535

Authors

Mattia Litrico, University of Catania
Mario Valerio Giuffrida, University of Nottingham
Sebastiano Battiato, University of Catania
Devis Tuia, École polytechnique fédérale de Lausanne

Abstract

Recent unsupervised domain adaptation (UDA) methods have shown great success in addressing classical domain shifts (e.g., synthetic-to-real), but they still struggle under complex shifts (e.g., geographical shift), where both the background and object appearances differ significantly across domains. Prior work has shown that the language modality can help in the adaptation process, as it is more robust to such complex shifts. In this paper, we introduce TRUST, a novel UDA approach that exploits the robustness of the language modality to guide the adaptation of a vision model. TRUST generates pseudo-labels for target samples from their captions and introduces a novel uncertainty estimation strategy that uses normalised CLIP similarity scores to estimate the uncertainty of the generated pseudo-labels. The estimated uncertainty is then used to reweight the classification loss, mitigating the adverse effects of wrong pseudo-labels obtained from low-quality captions. To further increase the robustness of the vision model, we propose a multimodal soft-contrastive learning loss that aligns the vision and language feature spaces by leveraging captions to guide the contrastive training of the vision model on target images. In our contrastive loss, each pair of images acts as both a positive and a negative pair, and their feature representations are attracted and repulsed with a strength proportional to the similarity of their captions. This avoids the need to make hard assignments of positive and negative pairs, which is critical in the UDA setting. Our approach outperforms previous methods, setting a new state of the art on classical (DomainNet) and complex (GeoNet) domain shifts. The code is available at https://github.com/MattiaLitrico/TRUST-Leveraging-Text-Robustness-for-Unsupervised-Domain-Adaptation.
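
The two loss ideas described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so every function name and formula below is an assumption made for illustration. In particular, the sketch assumes that pseudo-label confidence is the softmax-normalised CLIP score of the pseudo-label class, and that the soft-contrastive term is a binary cross-entropy whose soft target is the caption similarity rescaled to [0, 1]. Plain Python lists stand in for feature tensors.

```python
import math


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_cross_entropy(logits, pseudo_label, clip_scores):
    """Cross-entropy for one target sample, scaled by pseudo-label confidence.

    Assumption (not from the paper): confidence is the softmax-normalised
    CLIP similarity score of the pseudo-label class.
    """
    confidence = softmax(clip_scores)[pseudo_label]
    log_prob = math.log(softmax(logits)[pseudo_label])
    return -confidence * log_prob


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def soft_contrastive_loss(image_feats, caption_feats, temperature=0.5):
    """Mean soft-target binary cross-entropy over all distinct image pairs.

    Assumption (not from the paper): each pair is attracted with weight w
    and repulsed with weight 1 - w, where w is the caption cosine
    similarity rescaled from [-1, 1] to [0, 1].
    """
    n = len(image_feats)
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            w = 0.5 * (1.0 + cosine(caption_feats[i], caption_feats[j]))
            w = min(1.0, max(0.0, w))  # guard against rounding error
            s = cosine(image_feats[i], image_feats[j])
            # Probability that the vision model considers the pair similar.
            p = 1.0 / (1.0 + math.exp(-s / temperature))
            total += -(w * math.log(p) + (1.0 - w) * math.log(1.0 - p))
            pairs += 1
    return total / pairs
```

Under these assumptions, a sample whose pseudo-label receives a low normalised CLIP score contributes little to the classification loss, and no pair of images is ever labelled strictly positive or strictly negative: the caption similarity sets how strongly the pair is pulled together or pushed apart.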
