Proceedings of the AAAI Conference on Artificial Intelligence
Search Results
Found 25136 items.
TORA: Train Once, Realign Anytime for Offline Multi-Objective Reinforcement Learning
Weichen Li, Waleed Mustafa, Marcio Monteiro, Puyu Wang, Marius Kloft, Sophie Fellenz
37609-37617
2026-03-14
Bolster Hallucination Detection via Prompt-Guided Data Augmentation
Wenyun Li, Zheng Zhang, Dongmei Jiang, Xiangyuan Lan
37618-37626
2026-03-14
Editing as Unlearning: Are Knowledge Editing Methods Strong Baselines for Large Language Model Unlearning?
Zexi Li, Xiangzhu Wang, William F. Shen, Meghdad Kurmanji, Xinchi Qiu, Dongqi Cai, Chao Wu, Nicholas D. Lane
37627-37635
2026-03-14
How Much Do Large Language Model Cheat on Evaluation? Benchmarking Overestimation Under the One-Time-Pad-Based Framework
Zi Liang, Liantong Yu, Zhang Shiyu, Qingqing Ye, Haibo Hu
37636-37644
2026-03-14
Semantics-Preserving Adversarial Attacks on Event-Driven Stock Prediction Models
Aofan Liu, Haoxuan Li, Hongjian Xing, Yuguo Yin, Zijun Li, Yiyan Qi
37645-37653
2026-03-14
SRAM: Shape-Realism Alignment Metric for No Reference 3D Shape Evaluation
Sheng Liu, Tianyu Luan, Phani Nuney, Xuelu Feng, Junsong Yuan
37654-37662
2026-03-14
MRACL: Multi-Reward Space Guided Adaptive Curriculum Reinforcement Learning for LLMs
Wenxuan Liu, Liangyu Huo, Yi Jing, Xiyuan Zhang, Jian Xie
37663-37672
2026-03-14
On the Alignment of Large Language Models with Global Human Opinion
Yang Liu, Masahiro Kaneko, Chenhui Chu
37673-37681
2026-03-14
DarkBench+: An Extended Benchmark for Evaluating Dark Patterns in Large Language Models
Yaowen Liu, Shenjia Jing, Yufei Wei, Shoumin Zhang, Jinglu Zhang, Zhen Mei, Liangliang Yue, Jiarui Wang, Peng Zhang
37682-37691
2026-03-14
Targeting Misalignment: A Conflict-Aware Framework for Reward-Model-based LLM Alignment
Zixuan Liu, Siavash H. Khajavi, Guangkai Jiang, Xinru Liu
37692-37700
2026-03-14
Mitigating Self-Preference by Authorship Obfuscation
Taslim Mahbub, Shi Feng
37701-37708
2026-03-14
DETONATE – A Benchmark for Text-to-Image Alignment and Kernelized Direct Preference Optimization
Renjith Prasad Kaippilly Mana, Abhilekh Borah, Hasnat Md Abdullah, Chathurangi Shyalika, Gurpreet Singh, Ritvik Garimella, Rajarshi Roy, Harshul Raj Surana, Nasrin Imanpour, Suranjana Trivedy, Amit Sheth, Amitava Das
37709-37718
2026-03-14
Misalignment from Treating Means as Ends
Henrik Marklund, Alex Infanger, Benjamin Van Roy
37719-37727
2026-03-14
STACK: Adversarial Attacks on LLM Safeguard Pipelines
Ian R. McKenzie, Oskar John Hollinsworth, Tom Tseng, Xander Davies, Stephen Casper, Aaron David Tucker, Robert Kirk, Adam Gleave
37728-37737
2026-03-14
Aligning Machiavellian Agents: Behavior Steering via Test-Time Policy Shaping
Dena Mujtaba, Brian Hu, Anthony Hoogs, Arslan Basharat
37738-37746
2026-03-14
SharedRep-RLHF: A Shared Representation Approach to RLHF with Diverse Preferences
Arpan Mukherjee, Marcello Bullo, Deniz Gündüz
37747-37755
2026-03-14
Quiet Feature Learning in Algorithmic Tasks
Prudhviraj Naidu, Zixian Wang, Leon Bergen, Ramamohan Paturi
37756-37764
2026-03-14
A Tale of Two Identities: An Ethical Audit of AI-Crafted Synthetic Personas
Pranav Narayanan Venkit, Jiayi Li, Yingfan Zhou, Sarah Rajtmajer, Shomir Wilson
37765-37774
2026-03-14
Intrinsic Barriers and Practical Pathways for Human–AI Alignment: An Agreement-Based Complexity Analysis
Aran Nayebi
37775-37782
2026-03-14
CTPD: Cross Tokenizer Preference Distillation
Truong Nguyen, Phi Van Dat, Ngan Nguyen, Linh Ngo Van, Trung Le, Thanh Hong Nguyen
37783-37790
2026-03-14
Realist and Pluralist Conceptions of Intelligence and Their Implications on AI Research
Ninell Oldenburg, Ruchira Dhar, Anders Søgaard
37791-37801
2026-03-14
LieCraft: A Multi-Agent Framework for Evaluating Deceptive Capabilities in Language Models
Matthew Lyle Olson, Neale Ratzlaff, Musashi Hinck, Tri Nguyen, Vasudev Lal, Joseph Campbell, Simon Stepputtis, Shao-Yen Tseng
37802-37809
2026-03-14
Refine and Align: Confidence Calibration Through Multi-Agent Interaction in VQA
Ayush Pandey, Jai Bardhan, Ishita Jain, Ramya S Hebbalaguppe, Rohan Raju Dhanakshirur, Lovekesh Vig
37810-37819
2026-03-14
AdvBDGen: A Robust Framework for Generating Adaptive and Stealthy Backdoors in LLM Alignment
Pankayaraj Pathmanathan, Udari Madhushani Sehwag, Michael-Andrei Panaitescu-Liess, Cho-Yu Jason Chiang, Furong Huang
37820-37829
2026-03-14
Beyond I’m Sorry, I Can’t: Dissecting Large-Language-Model Refusal
Nirmalendu Prakash, Yeo Wei Jie, Amir Abdullah, Ranjan Satapathy, Erik Cambria, Roy Ka-Wei Lee
37830-37838
2026-03-14
21101 - 21125 of 25136 items