Proceedings of the AAAI Conference on Artificial Intelligence


Search Results

Found 25136 items.
  • Towards Benchmarking Privacy Vulnerabilities in Selective Forgetting with Large Language Models

    Wei Qian, Chenxu Zhao, Yangyi Li, Mengdi Huai
    37839-37848
    2026-03-14
  • Backdoor Attacks on Open Vocabulary Object Detectors via Multi-Modal Prompt Tuning

    Ankita Raj, Chetan Arora
    37849-37857
    2026-03-14
  • Chain-of-Thought Driven Adversarial Scenario Extrapolation for Robust Language Models

    Md Rafi Ur Rashid, Vishnu Asutosh Dasu, Ye Wang, Gang Tan, Shagufta Mehnaz
    37858-37866
    2026-03-14
  • FindTheFlaws: Annotated Errors for Detecting Flawed Reasoning and Scalable Oversight Research

    Gabriel Recchia, Chatrik Singh Mangat, Issac Li, Gayatri Krishnakumar
    37867-37876
    2026-03-14
  • Confirmation Bias: A Challenge for Scalable Oversight

    Gabriel Recchia, Chatrik Singh Mangat, Jinu Nyachhyon, Mridul Sharma, Callum Canavan, Dylan Epstein-Gross, Muhammed Abdulbari
    37877-37886
    2026-03-14
  • Mind the Gap: Quantifying and Aligning Human-AI Visual Attention for Accident Anticipation

    Hoe Sung Ryu, Christian Wallraven
    37887-37895
    2026-03-14
  • Polarity-Aware Probing for Quantifying Latent Alignment in Language Models

    Sabrina Sadiekh, Elena Ericheva, Chirag Agarwal
    37896-37903
    2026-03-14
  • Detecting Compute Structuring in AI Governance Is Likely Feasible

    Emmanouil Seferis, Timothy Fist
    37904-37912
    2026-03-14
  • Tight Robustness Certification Through the Convex Hull of ℓ₀ Attacks

    Yuval Shapira, Dana Drachsler-Cohen
    37913-37922
    2026-03-14
  • EASE: Practical and Efficient Safety Alignment for Small Language Models

    Haonan Shi, Guoli Wang, Tu Ouyang, An Wang
    37923-37931
    2026-03-14
  • Efficient Switchable Safety Control in LLMs via Magic-Token-Guided Co-Training

    Jianfeng Si, Lin Sun, Zhewen Tan, Xiangzheng Zhang
    37932-37940
    2026-03-14
  • Beyond Verdicts: Evaluating Language Model Moral Competence

    Aaron J Snoswell, Daniel Kilov, Seth Lazar
    37941-37950
    2026-03-14
  • SMPRO: Self-Supervised Visual Preference Alignment via Differentiable Multi-Preference Multi-Group Ranking

    Sirnam Swetha, Rui Meng, Shwetha Ram, Tal Neiman, Son Tran, Mubarak Shah
    37951-37960
    2026-03-14
  • Persistent Instability in LLM’s Personality Measurements: Effects of Scale, Reasoning, and Conversation History

    Tommaso Tosato, Saskia Helbling, Yorguin-Jose Mantilla-Ramos, Mahmood Hegazy, Alberto Tosato, David John Lemay, Irina Rish, Guillaume Dumas
    37961-37969
    2026-03-14
  • Shadows in the Code: Exploring the Risks and Defenses of LLM-based Multi-Agent Software Development Systems

    Xiaoqing Wang, Keman Huang, Bin Liang, Hongyu Li, Xiaoyong Du
    37970-37978
    2026-03-14
  • Benchmarking Trustworthiness in Multimodal LLMs for Video Understanding

    Youze Wang, Zijun Chen, Ruoyu Chen, Shishen Gu, Wenbo Hu, Jiayang Liu, Yinpeng Dong, Hang Su, Jun Zhu, Meng Wang, Richang Hong
    37979-37987
    2026-03-14
  • STAR-1: Safer Alignment of Reasoning LLMs with 1K Data

    Zijun Wang, Haoqin Tu, Yuhan Wang, Juncheng Wu, Yanqing Liu, Jieru Mei, Brian R. Bartoldson, Bhavya Kailkhura, Cihang Xie
    37988-37997
    2026-03-14
  • CluCERT: Certifying LLM Robustness via Clustering-Guided Denoising Smoothing

    Zixia Wang, Gaojie Jin, Jia Hu, Ronghui Mu
    37998-38006
    2026-03-14
  • Safe Multi-agent Reinforcement Learning with Natural Language Constraints

    Ziyan Wang, Meng Fang, Tristan Tomilin, Fei Fang, Yali Du
    38007-38015
    2026-03-14
  • Designing Incident Reporting Systems for Harms from General-Purpose AI

    Kevin Wei, Lennart Heim
    38016-38029
    2026-03-14
  • HumorReject: Decoupling LLM Safety from Refusal Prefix via a Little Humor

    Zihui Wu, Haichang Gao, Jiacheng Luo, Zhaoxiang Liu
    38030-38038
    2026-03-14
  • MCA-Bench: A Multimodal Benchmark for Evaluating CAPTCHA Robustness Against VLM-based Attacks

    Zonglin Wu, Yule Xue, Yaoyao Feng, Xiaolong Wang, Yiren Song
    38039-38047
    2026-03-14
  • MedAtlas: Evaluating LLMs for Multi-Round, Multi-Task Medical Reasoning Across Diverse Imaging Modalities and Clinical Text

    Ronghao Xu, Zhen Huang, Yangbo Wei, Xiaoqian Zhou, Zikang Xu, Ting Liu, Zihang Jiang, S. Kevin Zhou
    38048-38056
    2026-03-14
  • When Human Preferences Flip: An Instance-Dependent Robust Loss for RLHF

    Yifan Xu, Xichen Ye, Yifan Chen, Qiaosheng Zhang
    38057-38065
    2026-03-14
  • Multi-Faceted Attack: Exposing Cross-Model Vulnerabilities in Defense-Equipped Vision-Language Models

    Yijun Yang, Lichao Wang, Jianping Zhang, Chi Harold Liu, Lanqing Hong, Qiang Xu
    38066-38074
    2026-03-14


Copyright © 2024, Association for the Advancement of Artificial Intelligence
