Proceedings of the AAAI Conference on Artificial Intelligence

Search Results

Found 25136 items.
  • TORA: Train Once, Realign Anytime for Offline Multi-Objective Reinforcement Learning

    Weichen Li, Waleed Mustafa, Marcio Monteiro, Puyu Wang, Marius Kloft, Sophie Fellenz
    37609-37617
    2026-03-14
  • Bolster Hallucination Detection via Prompt-Guided Data Augmentation

    Wenyun Li, Zheng Zhang, Dongmei Jiang, Xiangyuan Lan
    37618-37626
    2026-03-14
  • Editing as Unlearning: Are Knowledge Editing Methods Strong Baselines for Large Language Model Unlearning?

    Zexi Li, Xiangzhu Wang, William F. Shen, Meghdad Kurmanji, Xinchi Qiu, Dongqi Cai, Chao Wu, Nicholas D. Lane
    37627-37635
    2026-03-14
  • How Much Do Large Language Model Cheat on Evaluation? Benchmarking Overestimation Under the One-Time-Pad-Based Framework

    Zi Liang, Liantong Yu, Zhang Shiyu, Qingqing Ye, Haibo Hu
    37636-37644
    2026-03-14
  • Semantics-Preserving Adversarial Attacks on Event-Driven Stock Prediction Models

    Aofan Liu, Haoxuan Li, Hongjian Xing, Yuguo Yin, Zijun Li, Yiyan Qi
    37645-37653
    2026-03-14
  • SRAM: Shape-Realism Alignment Metric for No Reference 3D Shape Evaluation

    Sheng Liu, Tianyu Luan, Phani Nuney, Xuelu Feng, Junsong Yuan
    37654-37662
    2026-03-14
  • MRACL: Multi-Reward Space Guided Adaptive Curriculum Reinforcement Learning for LLMs

    Wenxuan Liu, Liangyu Huo, Yi Jing, Xiyuan Zhang, Jian Xie
    37663-37672
    2026-03-14
  • On the Alignment of Large Language Models with Global Human Opinion

    Yang Liu, Masahiro Kaneko, Chenhui Chu
    37673-37681
    2026-03-14
  • DarkBench+: An Extended Benchmark for Evaluating Dark Patterns in Large Language Models

    Yaowen Liu, Shenjia Jing, Yufei Wei, Shoumin Zhang, Jinglu Zhang, Zhen Mei, Liangliang Yue, Jiarui Wang, Peng Zhang
    37682-37691
    2026-03-14
  • Targeting Misalignment: A Conflict-Aware Framework for Reward-Model-based LLM Alignment

    Zixuan Liu, Siavash H. Khajavi, Guangkai Jiang, Xinru Liu
    37692-37700
    2026-03-14
  • Mitigating Self-Preference by Authorship Obfuscation

    Taslim Mahbub, Shi Feng
    37701-37708
    2026-03-14
  • DETONATE – A Benchmark for Text-to-Image Alignment and Kernelized Direct Preference Optimization

    Renjith Prasad Kaippilly Mana, Abhilekh Borah, Hasnat Md Abdullah, Chathurangi Shyalika, Gurpreet Singh, Ritvik Garimella, Rajarshi Roy, Harshul Raj Surana, Nasrin Imanpour, Suranjana Trivedy, Amit Sheth, Amitava Das
    37709-37718
    2026-03-14
  • Misalignment from Treating Means as Ends

    Henrik Marklund, Alex Infanger, Benjamin Van Roy
    37719-37727
    2026-03-14
  • STACK: Adversarial Attacks on LLM Safeguard Pipelines

    Ian R. McKenzie, Oskar John Hollinsworth, Tom Tseng, Xander Davies, Stephen Casper, Aaron David Tucker, Robert Kirk, Adam Gleave
    37728-37737
    2026-03-14
  • Aligning Machiavellian Agents: Behavior Steering via Test-Time Policy Shaping

    Dena Mujtaba, Brian Hu, Anthony Hoogs, Arslan Basharat
    37738-37746
    2026-03-14
  • SharedRep-RLHF: A Shared Representation Approach to RLHF with Diverse Preferences

    Arpan Mukherjee, Marcello Bullo, Deniz Gündüz
    37747-37755
    2026-03-14
  • Quiet Feature Learning in Algorithmic Tasks

    Prudhviraj Naidu, Zixian Wang, Leon Bergen, Ramamohan Paturi
    37756-37764
    2026-03-14
  • A Tale of Two Identities: An Ethical Audit of AI-Crafted Synthetic Personas

    Pranav Narayanan Venkit, Jiayi Li, Yingfan Zhou, Sarah Rajtmajer, Shomir Wilson
    37765-37774
    2026-03-14
  • Intrinsic Barriers and Practical Pathways for Human–AI Alignment: An Agreement-Based Complexity Analysis

    Aran Nayebi
    37775-37782
    2026-03-14
  • CTPD: Cross Tokenizer Preference Distillation

    Truong Nguyen, Phi Van Dat, Ngan Nguyen, Linh Ngo Van, Trung Le, Thanh Hong Nguyen
    37783-37790
    2026-03-14
  • Realist and Pluralist Conceptions of Intelligence and Their Implications on AI Research

    Ninell Oldenburg, Ruchira Dhar, Anders Søgaard
    37791-37801
    2026-03-14
  • LieCraft: A Multi-Agent Framework for Evaluating Deceptive Capabilities in Language Models

    Matthew Lyle Olson, Neale Ratzlaff, Musashi Hinck, Tri Nguyen, Vasudev Lal, Joseph Campbell, Simon Stepputtis, Shao-Yen Tseng
    37802-37809
    2026-03-14
  • Refine and Align: Confidence Calibration Through Multi-Agent Interaction in VQA

    Ayush Pandey, Jai Bardhan, Ishita Jain, Ramya S Hebbalaguppe, Rohan Raju Dhanakshirur, Lovekesh Vig
    37810-37819
    2026-03-14
  • AdvBDGen: A Robust Framework for Generating Adaptive and Stealthy Backdoors in LLM Alignment

    Pankayaraj Pathmanathan, Udari Madhushani Sehwag, Michael-Andrei Panaitescu-Liess, Cho-Yu Jason Chiang, Furong Huang
    37820-37829
    2026-03-14
  • Beyond I’m Sorry, I Can’t: Dissecting Large-Language-Model Refusal

    Nirmalendu Prakash, Yeo Wei Jie, Amir Abdullah, Ranjan Satapathy, Erik Cambria, Roy Ka-Wei Lee
    37830-37838
    2026-03-14
21101 - 21125 of 25136 items


Copyright © 2024, Association for the Advancement of Artificial Intelligence
