Proceedings of the AAAI Conference on Artificial Intelligence
Search Results
Found 25136 items.
The Alignment Game: A Theory of Long-Horizon Alignment Through Recursive Curation
Ali Falahati, Mohammad Mohammadi Amiri, Kate Larson, Lukasz Golab
37379-37386
2026-03-14
SMiLE: Provably Enforcing Global Relational Properties in Neural Networks
Matteo Francobaldi, Michele Lombardi, Andrea Lodi
37387-37395
2026-03-14
EssayBench: Evaluating Large Language Models in Multi-Genre Chinese Essay Writing
Fan Gao, Dongyuan Li, Ding Xia, Fei Mi, Yasheng Wang, Lifeng Shang, Baojun Wang
37396-37406
2026-03-14
Beyond Transcription: Mechanistic Interpretability in ASR
Neta Glazer, Yael Segal-Feldman, Hilit Segev, Aviv Shamsian, Asaf Buchnick, Gill Hetz, Ethan Fetaya, Joseph Keshet, Aviv Navon
37407-37416
2026-03-14
AlignTree: Efficient Defense Against LLM Jailbreak Attacks
Gil Goren, Shahar Katz, Lior Wolf
37417-37425
2026-03-14
Identifying Features Associated with Bias Against 93 Stigmatized Groups in Language Models and Guardrail Model Safety Mitigation
Anna-Maria Gueorguieva, Aylin Caliskan
37426-37434
2026-03-14
Resolving Predictive Multiplicity for the Rashomon Set
Parian Haghighat, Hadis Anahideh, Cynthia Rudin
37435-37442
2026-03-14
Unintended Misalignment from Agentic Fine-Tuning: Risks and Mitigation
Dongyoon Hahm, Taywon Min, Woogyeol Jin, Kimin Lee
37443-37451
2026-03-14
Silenced Biases: The Dark Side LLMs Learned to Refuse
Rom Himelstein, Amit LeVi, Brit Youngmann, Yaniv Nemcovsky, Avi Mendelson
37452-37461
2026-03-14
TAPO: Dynamic Teacher and Perturbed Answer Injection for Policy Optimization
Maowei Jiang, Zihang Wang, Qi Wang, Peter Búš, Moquan Cheng, Yifan Wang, Quangao Liu, Ruiqi Li, Pengyu Zeng, Ruikai Liu, Alan Liang, Yansong Xu, Yusong Hu, Chaoran Zhang, Zhiyong Dong
37462-37471
2026-03-14
Uncovering and Aligning Anomalous Attention Heads to Defend Against NLP Backdoor Attacks
Haotian Jin, Yang Li, Haihui Fan, Lin Shen, Xiangfang Li, Bo Li
37472-37480
2026-03-14
Requirements for Aligned, Dynamic Resolution of Conflicts in Operational Constraints
Steven J. Jones, Robert E. Wray, John E. Laird
37481-37490
2026-03-14
Benchmarking XAI Explanations with Human-Aligned Evaluations
Rémi Kazmierczak, Steve Azzolin, Eloïse Berthier, Anna Hedström, Patricia Delhomme, David Filliat, Nicolas Bousquet, Goran Frehse, Massimiliano Mancini, Baptiste Caramiaux, Andrea Passerini, Gianni Franchi
37491-37500
2026-03-14
Moral Change or Noise? On Problems of Aligning AI with Temporally Unstable Human Feedback
Vijay Keswani, Cyrus Cousins, Breanna Nguyen, Vincent Conitzer, Hoda Heidari, Jana Schaich Borg, Walter Sinnott-Armstrong
37501-37509
2026-03-14
Transparent Networks for Multivariate Time Series
Minkyu Kim, Suan Lee, Jinho Kim
37510-37518
2026-03-14
Align to Structure: Aligning Large Language Models with Structural Information
Zae Myung Kim, Anand Ramachandran, Farideh Tavazoee, Joo-Kyung Kim, Oleg Rokhlenko, Dongyeop Kang
37519-37528
2026-03-14
Beautiful Images, Toxic Words: Understanding and Addressing Offensive Text in Generated Images
Aditya Kumar, Tom Blanchard, Adam Dziedzic, Franziska Boenisch
37529-37537
2026-03-14
Cost-Minimized Label-Flipping Poisoning Attack to LLM Alignment
Shigeki Kusaka, Keita Saito, Mikoto Kudo, Takumi Tanabe, Akifumi Wachi, Youhei Akimoto
37538-37546
2026-03-14
Dropouts in Confidence: Moral Uncertainty in Human-LLM Alignment
Jea Kwon, Luiz Felipe Vecchietti, Sungwon Park, Meeyoung Cha
37547-37555
2026-03-14
Selective Weak-to-Strong Generalization
Hao Lang, Fei Huang, Yongbin Li
37556-37564
2026-03-14
MobileSafetyBench: Evaluating Safety of Autonomous Agents in Mobile Device Control
Juyong Lee, Dongyoon Hahm, June Suk Choi, W. Bradley Knox, Kimin Lee
37565-37573
2026-03-14
STELAR-VISION: Self-Topology-Aware Efficient Learning for Aligned Reasoning in Vision
Chen Li, Han Zhang, Zhantao Yang, Fangyi Chen, Zihan Wang, Anudeepsekhar Bolimera, Marios Savvides
37574-37582
2026-03-14
ARGH-Mark: Anchor-Synchronized Watermarking with Hamming Correction for Robust and Quality-Preserving LLM Attribution
He Li, Xiaojun Chen, Jingcheng He, Zhendong Zhao, Shuguang Yuan, Xin Zhao, Yunfei Yang
37583-37590
2026-03-14
StyleBreak: Revealing Alignment Vulnerabilities in Large Audio-Language Models via Style-Aware Audio Jailbreak
Hongyi Li, Chengxuan Zhou, Chu Wang, Sicheng Liang, Yanting Chen, Qinlin Xie, Jiawei Ye, Jie Wu
37591-37599
2026-03-14
How Bias Binds: Measuring Hidden Associations for Bias Control in Text-to-Image Compositions
Jeng-Lin Li, Ming-Ching Chang, Wei-Chao Chen
37600-37608
2026-03-14
21076 - 21100 of 25136 items