Learning to LEAP: Efficient Dense Point Tracking by Focusing Where It Matters

Authors

  • Chenzhi Zhao, Beijing University of Posts and Telecommunications
  • Wufan Wang, Beijing University of Posts and Telecommunications
  • Bo Zhang, Beijing University of Posts and Telecommunications
  • Wendong Wang, Beijing University of Posts and Telecommunications

DOI:

https://doi.org/10.1609/aaai.v40i15.38311

Abstract

Tracking Any Point (TAP) is a foundational task in computer vision with broad applicability. The state-of-the-art self-supervised TAP method leverages a global matching transformer and contrastive random walks to learn point correspondences. However, its dense all-pairs attention and correlation volume computation tend to introduce irrelevant features and produce less informative training signals, degrading both learning efficiency and tracking accuracy. To address these limitations, we introduce LEAP-Track, a self-supervised TAP approach that computes the attention matrices and correlation volume over adaptively selected sparse pairs. It consists of two core designs: (1) Curriculum-based Sparse Attention (CSA), which dynamically focuses on the most relevant keys, promoting the learning of discriminative features; and (2) Progressive k-NN Transition (PkT), which reformulates the contrastive random walk to operate on an increasingly sparse k-NN affinity graph to reinforce the learning of the most informative correspondences. By integrating the above two designs into a two-stage training paradigm, LEAP-Track is shown both theoretically and empirically to effectively boost learning efficiency, achieving superior tracking accuracy over existing self-supervised TAP methods.
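To make the Progressive k-NN Transition idea concrete, the sketch below shows how a dense affinity matrix can be sparsified to its top-k entries per row and row-normalized into a stochastic transition matrix, as used in a contrastive random walk. This is a minimal illustrative example, not the authors' implementation; the function name and the toy data are hypothetical, and details such as temperature scaling and the curriculum schedule for k are omitted.

```python
import numpy as np

def knn_transition(affinity, k):
    """Sparsify an affinity matrix to its top-k entries per row,
    then row-normalize into a random-walk transition matrix.
    (Illustrative sketch only; not the paper's actual code.)"""
    n = affinity.shape[0]
    sparse = np.zeros_like(affinity)
    for i in range(n):
        # indices of the k largest affinities in row i
        idx = np.argpartition(affinity[i], -k)[-k:]
        sparse[i, idx] = affinity[i, idx]
    # normalize each row so it sums to 1 (a valid transition distribution)
    row_sums = sparse.sum(axis=1, keepdims=True)
    return sparse / np.clip(row_sums, 1e-12, None)

# toy example: 4 nodes with random pairwise affinities
rng = np.random.default_rng(0)
A = rng.random((4, 4))
P = knn_transition(A, k=2)  # each row keeps only its 2 strongest edges
```

In a progressive schedule, k would start large (near-dense transitions) and shrink over training, concentrating the walk, and hence the contrastive learning signal, on the most informative correspondences.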

Published

2026-03-14

How to Cite

Zhao, C., Wang, W., Zhang, B., & Wang, W. (2026). Learning to LEAP: Efficient Dense Point Tracking by Focusing Where It Matters. Proceedings of the AAAI Conference on Artificial Intelligence, 40(15), 13108–13116. https://doi.org/10.1609/aaai.v40i15.38311

Section

AAAI Technical Track on Computer Vision XII