Transformer-Based Selective Super-resolution for Efficient Image Refinement

Tianyi Zhang; Kishore Kasichainula; Yaoxin Zhuo; Baoxin Li; Jae-Sun Seo; Yu Cao

doi:10.1609/aaai.v38i7.28560

Authors

Tianyi Zhang, University of Minnesota
Kishore Kasichainula, Arizona State University
Yaoxin Zhuo, Arizona State University
Baoxin Li, Arizona State University
Jae-Sun Seo, Cornell Tech
Yu Cao, University of Minnesota

DOI:

https://doi.org/10.1609/aaai.v38i7.28560

Keywords:

CV: Other Foundations of Computer Vision, CV: Representation Learning for Vision, CV: Computational Photography, Image & Video Synthesis, ML: Deep Generative Models & Autoencoders

Abstract

Conventional super-resolution methods suffer from two drawbacks: the substantial computational cost of upscaling an entire large image, and the introduction of extraneous or potentially detrimental information for downstream computer vision tasks when the background is refined. To address these issues, we propose a novel transformer-based algorithm, Selective Super-Resolution (SSR), which partitions images into non-overlapping tiles, selects tiles of interest at various scales with a pyramid architecture, and reconstructs only these selected tiles with deep features. Experimental results on three datasets demonstrate the efficiency and robust performance of our approach for super-resolution. Compared to state-of-the-art methods, SSR reduces the FID score from 26.78 to 10.41 with a 40% reduction in computation cost on the BDD100K dataset.
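
The tile-then-select idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the paper's selector is a transformer with a pyramid architecture working at several scales, whereas the sketch below works at a single scale and uses a plain pixel-variance threshold as a stand-in. All function names, the threshold, and the `refine` placeholder are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set, Tuple

Block = List[List[float]]   # grayscale pixels, row-major
TileKey = Tuple[int, int]   # (tile row, tile column)


def split_into_tiles(image: Block, tile: int) -> Dict[TileKey, Block]:
    """Partition an image into non-overlapping tile x tile blocks."""
    h, w = len(image), len(image[0])
    if h % tile or w % tile:
        raise ValueError("image size must be a multiple of the tile size")
    return {
        (r // tile, c // tile): [row[c:c + tile] for row in image[r:r + tile]]
        for r in range(0, h, tile)
        for c in range(0, w, tile)
    }


def variance(block: Block) -> float:
    vals = [v for row in block for v in row]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)


def select_tiles(tiles: Dict[TileKey, Block], threshold: float) -> Set[TileKey]:
    """Stand-in selector (assumption): keep tiles with high pixel variance."""
    return {key for key, block in tiles.items() if variance(block) > threshold}


def upscale_nearest(block: Block, scale: int) -> Block:
    """Cheap nearest-neighbor upscaling, used here for every tile."""
    return [[v for v in row for _ in range(scale)]
            for row in block for _ in range(scale)]


def refine(block: Block) -> Block:
    """Placeholder for a costly reconstruction step; here it only copies."""
    return [row[:] for row in block]


def selective_upscale(image: Block, tile: int, scale: int, threshold: float):
    """Upscale every tile cheaply, and refine only the selected ones."""
    tiles = split_into_tiles(image, tile)
    selected = select_tiles(tiles, threshold)
    out = [[0.0] * (len(image[0]) * scale) for _ in range(len(image) * scale)]
    for (tr, tc), block in tiles.items():
        up = upscale_nearest(block, scale)
        if (tr, tc) in selected:
            up = refine(up)  # the expensive path runs on selected tiles only
        r0, c0 = tr * tile * scale, tc * tile * scale
        for i, row in enumerate(up):
            out[r0 + i][c0:c0 + len(row)] = row
    return out, selected
```

In this sketch the saving comes from calling `refine` only on the tiles in `selected`; background tiles take the cheap path. How tiles are actually scored, and what the reconstruction network looks like, is described in the paper rather than here.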
