Proceedings of the AAAI Conference on Artificial Intelligence
Search Results
Found 25136 items.
Towards Benchmarking Privacy Vulnerabilities in Selective Forgetting with Large Language Models
Wei Qian, Chenxu Zhao, Yangyi Li, Mengdi Huai
37839-37848
2026-03-14
Backdoor Attacks on Open Vocabulary Object Detectors via Multi-Modal Prompt Tuning
Ankita Raj, Chetan Arora
37849-37857
2026-03-14
Chain-of-Thought Driven Adversarial Scenario Extrapolation for Robust Language Models
Md Rafi Ur Rashid, Vishnu Asutosh Dasu, Ye Wang, Gang Tan, Shagufta Mehnaz
37858-37866
2026-03-14
FindTheFlaws: Annotated Errors for Detecting Flawed Reasoning and Scalable Oversight Research
Gabriel Recchia, Chatrik Singh Mangat, Issac Li, Gayatri Krishnakumar
37867-37876
2026-03-14
Confirmation Bias: A Challenge for Scalable Oversight
Gabriel Recchia, Chatrik Singh Mangat, Jinu Nyachhyon, Mridul Sharma, Callum Canavan, Dylan Epstein-Gross, Muhammed Abdulbari
37877-37886
2026-03-14
Mind the Gap: Quantifying and Aligning Human-AI Visual Attention for Accident Anticipation
Hoe Sung Ryu, Christian Wallraven
37887-37895
2026-03-14
Polarity-Aware Probing for Quantifying Latent Alignment in Language Models
Sabrina Sadiekh, Elena Ericheva, Chirag Agarwal
37896-37903
2026-03-14
Detecting Compute Structuring in AI Governance Is Likely Feasible
Emmanouil Seferis, Timothy Fist
37904-37912
2026-03-14
Tight Robustness Certification Through the Convex Hull of ℓ₀ Attacks
Yuval Shapira, Dana Drachsler-Cohen
37913-37922
2026-03-14
EASE: Practical and Efficient Safety Alignment for Small Language Models
Haonan Shi, Guoli Wang, Tu Ouyang, An Wang
37923-37931
2026-03-14
Efficient Switchable Safety Control in LLMs via Magic-Token-Guided Co-Training
Jianfeng Si, Lin Sun, Zhewen Tan, Xiangzheng Zhang
37932-37940
2026-03-14
Beyond Verdicts: Evaluating Language Model Moral Competence
Aaron J Snoswell, Daniel Kilov, Seth Lazar
37941-37950
2026-03-14
SMPRO: Self-Supervised Visual Preference Alignment via Differentiable Multi-Preference Multi-Group Ranking
Sirnam Swetha, Rui Meng, Shwetha Ram, Tal Neiman, Son Tran, Mubarak Shah
37951-37960
2026-03-14
Persistent Instability in LLM’s Personality Measurements: Effects of Scale, Reasoning, and Conversation History
Tommaso Tosato, Saskia Helbling, Yorguin-Jose Mantilla-Ramos, Mahmood Hegazy, Alberto Tosato, David John Lemay, Irina Rish, Guillaume Dumas
37961-37969
2026-03-14
Shadows in the Code: Exploring the Risks and Defenses of LLM-based Multi-Agent Software Development Systems
Xiaoqing Wang, Keman Huang, Bin Liang, Hongyu Li, Xiaoyong Du
37970-37978
2026-03-14
Benchmarking Trustworthiness in Multimodal LLMs for Video Understanding
Youze Wang, Zijun Chen, Ruoyu Chen, Shishen Gu, Wenbo Hu, Jiayang Liu, Yinpeng Dong, Hang Su, Jun Zhu, Meng Wang, Richang Hong
37979-37987
2026-03-14
STAR-1: Safer Alignment of Reasoning LLMs with 1K Data
Zijun Wang, Haoqin Tu, Yuhan Wang, Juncheng Wu, Yanqing Liu, Jieru Mei, Brian R. Bartoldson, Bhavya Kailkhura, Cihang Xie
37988-37997
2026-03-14
CluCERT: Certifying LLM Robustness via Clustering-Guided Denoising Smoothing
Zixia Wang, Gaojie Jin, Jia Hu, Ronghui Mu
37998-38006
2026-03-14
Safe Multi-agent Reinforcement Learning with Natural Language Constraints
Ziyan Wang, Meng Fang, Tristan Tomilin, Fei Fang, Yali Du
38007-38015
2026-03-14
Designing Incident Reporting Systems for Harms from General-Purpose AI
Kevin Wei, Lennart Heim
38016-38029
2026-03-14
HumorReject: Decoupling LLM Safety from Refusal Prefix via a Little Humor
Zihui Wu, Haichang Gao, Jiacheng Luo, Zhaoxiang Liu
38030-38038
2026-03-14
MCA-Bench: A Multimodal Benchmark for Evaluating CAPTCHA Robustness Against VLM-based Attacks
Zonglin Wu, Yule Xue, Yaoyao Feng, Xiaolong Wang, Yiren Song
38039-38047
2026-03-14
MedAtlas: Evaluating LLMs for Multi-Round, Multi-Task Medical Reasoning Across Diverse Imaging Modalities and Clinical Text
Ronghao Xu, Zhen Huang, Yangbo Wei, Xiaoqian Zhou, Zikang Xu, Ting Liu, Zihang Jiang, S. Kevin Zhou
38048-38056
2026-03-14
When Human Preferences Flip: An Instance-Dependent Robust Loss for RLHF
Yifan Xu, Xichen Ye, Yifan Chen, Qiaosheng Zhang
38057-38065
2026-03-14
Multi-Faceted Attack: Exposing Cross-Model Vulnerabilities in Defense-Equipped Vision-Language Models
Yijun Yang, Lichao Wang, Jianping Zhang, Chi Harold Liu, Lanqing Hong, Qiang Xu
38066-38074
2026-03-14
21126 - 21150 of 25136 items