SNS-Grasp: Semantic-guided Noise Scaling for Grasp Generation
DOI:
https://doi.org/10.1609/aaai.v40i11.37909
Abstract
While diffusion models show promise for intent-based grasp generation, their isotropic noise schedules struggle with joint-specific sensitivity and task-aware variability. This limitation leads to grasps with suboptimal semantic alignment or physical feasibility. To address this challenge, we propose Semantic-guided Noise Scaling for grasp generation (SNS-Grasp), a novel framework that integrates two key innovations. First, the Semantic-guided Noise Scaling Diffusion (SNS-Diff) module generates intent-aware grasps by replacing isotropic noise with anisotropic modulation, dynamically adapting to task semantics and joint-specific sensitivity. Specifically, SNS-Diff leverages a pretrained Intent Recognizer to extract task-aware confidence scores and joint-specific gradient sensitivities from the interaction context. These signals adjust the noise scaling during denoising, downweighting perturbations for semantically critical joints to ensure semantic alignment. Second, the Fine-grained Grasp Refinement (FGR) module establishes dynamic joint-vertex coupling through fine-grained hand-object spatial relationships, enabling iterative optimization of physically executable grasps. Extensive experiments on OakInk and GRAB demonstrate SNS-Grasp's superior performance in semantic accuracy and physical feasibility, with robust generalization to unseen objects.
Published
2026-03-14
How to Cite
Tang, Z., Zheng, Y., Zhong, Y., Li, H., Hao, Y., & Pun, C.-M. (2026). SNS-Grasp: Semantic-guided Noise Scaling for Grasp Generation. Proceedings of the AAAI Conference on Artificial Intelligence, 40(11), 9484–9492. https://doi.org/10.1609/aaai.v40i11.37909
Section
AAAI Technical Track on Computer Vision VIII
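To make the core idea in the abstract concrete, the sketch below illustrates anisotropic, confidence-weighted noise in a single denoising step. This is a minimal toy example, not the authors' implementation: the function name, the per-joint data layout, and the scaling rule `sigma_j = base_sigma * (1 - confidence_j)` are all assumptions chosen for illustration, not details taken from the paper.

```python
import random

def anisotropic_noise_step(joints, base_sigma, confidence, rng=None):
    """Toy sketch (not the paper's method): perturb each joint with
    its own noise scale, down-weighting perturbations on joints whose
    task-aware confidence is high, so semantically critical joints
    stay close to their current pose during denoising.

    joints     : list of per-joint parameter lists, e.g. [[x, y], ...]
    base_sigma : scalar base noise level for this denoising step
    confidence : per-joint confidence scores in [0, 1]
    """
    rng = rng or random.Random(0)
    noisy = []
    for joint, c in zip(joints, confidence):
        # Hypothetical scaling rule: sigma_j = base_sigma * (1 - c).
        # A fully confident joint (c = 1) receives zero perturbation.
        sigma = base_sigma * (1.0 - c)
        noisy.append([x + rng.gauss(0.0, sigma) for x in joint])
    return noisy

# Usage: the intent-critical joint (confidence 1.0) is left untouched,
# while low-confidence joints are free to explore via larger noise.
joints = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
conf = [1.0, 0.5, 0.0]
out = anisotropic_noise_step(joints, base_sigma=0.1, confidence=conf)
```

In an actual diffusion sampler this per-joint scale would modulate the noise injected at every reverse step; the sketch isolates just that one modulation to show how confidence maps to perturbation magnitude.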