Proceedings of the AAAI Conference on Artificial Intelligence

Search Results

Found 25136 items.
  • SafeInfer: Context Adaptive Decoding Time Safety Alignment for Large Language Models

    Somnath Banerjee, Sayan Layek, Soham Tripathy, Shanu Kumar, Animesh Mukherjee, Rima Hazra
    27188-27196
    2025-04-11
  • Bridging the Knowledge Gap: Understanding User Expectations for Trustworthy LLM Standards

    Michaela Benk, Léane Wettstein, Nadine Schlicker, Florian von Wangenheim, Nicolas Scharowski
    27197-27205
    2025-04-11
  • Scaling Trends for Data Poisoning in LLMs

    Dillon Bowen, Brendan Murphy, Will Cai, David Khachaturov, Adam Gleave, Kellin Pelrine
    27206-27214
    2025-04-11
  • Verification of Neural Networks Against Convolutional Perturbations via Parameterised Kernels

    Benedikt Brückner, Alessio Lomuscio
    27215-27223
    2025-04-11
  • Risk Controlled Image Retrieval

    Kaiwen Cai, Chris Xiaoxuan Lu, Xingyu Zhao, Wei Huang, Xiaowei Huang
    27224-27232
    2025-04-11
  • Political Bias Prediction Models Focus on Source Cues, Not Semantics

    Selin Chun, Daejin Choi, Taekyoung Kwon
    27233-27241
    2025-04-11
  • Searching for Unfairness in Algorithms’ Outputs: Novel Tests and Insights

    Ian Davidson, S. S. Ravi
    27242-27249
    2025-04-11
  • In Search of Trees: Decision-Tree Policy Synthesis for Black-Box Systems via Search

    Emir Demirović, Christian Schilling, Anna Lukina
    27250-27257
    2025-04-11
  • Evaluate with the Inverse: Efficient Approximation of Latent Explanation Quality Distribution

    Carlos Eiras-Franco, Anna Hedström, Marina M.-C. Höhne
    27258-27267
    2025-04-11
  • Retrieving Versus Understanding Extractive Evidence in Few-Shot Learning

    Karl Elbakian, Samuel Carton
    27268-27276
    2025-04-11
  • LEGEND: Leveraging Representation Engineering to Annotate Safety Margin for Preference Datasets

    Duanyu Feng, Bowen Qin, Chen Huang, Youcheng Huang, Zheng Zhang, Wenqiang Lei
    27277-27285
    2025-04-11
  • SMLE: Safe Machine Learning via Embedded Overapproximation

    Matteo Francobaldi, Michele Lombardi
    27286-27294
    2025-04-11
  • MIA-Tuner: Adapting Large Language Models as Pre-training Text Detector

    Wenjie Fu, Huandong Wang, Chen Gao, Guanghua Liu, Yong Li, Tao Jiang
    27295-27303
    2025-04-11
  • The Partially Observable Off-Switch Game

    Andrew Garber, Rohan Subramani, Linus Luu, Mark Bedaywi, Stuart Russell, Scott Emmons
    27304-27311
    2025-04-11
  • UFID: A Unified Framework for Black-box Input-level Backdoor Detection on Diffusion Models

    Zihan Guan, Mengxuan Hu, Sheng Li, Anil Kumar Vullikanti
    27312-27320
    2025-04-11
  • Robust Multi-Objective Preference Alignment with Online DPO

    Raghav Gupta, Ryan Sullivan, Yunxuan Li, Samrat Phatale, Abhinav Rastogi
    27321-27329
    2025-04-11
  • Token Highlighter: Inspecting and Mitigating Jailbreak Prompts for Large Language Models

    Xiaomeng Hu, Pin-Yu Chen, Tsung-Yi Ho
    27330-27338
    2025-04-11
  • Joint Scoring Rules: Competition Between Agents Avoids Performative Prediction

    Rubi Hudson
    27339-27346
    2025-04-11
  • ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates

    Fengqing Jiang, Zhangchen Xu, Luyao Niu, Bill Yuchen Lin, Radha Poovendran
    27347-27355
    2025-04-11
  • Dynamic Algorithm Termination for Branch-and-Bound-based Neural Network Verification

    Konstantin Kaulen, Matthias König, Holger H. Hoos
    27356-27364
    2025-04-11
  • Quantifying Misalignment Between Agents: Towards a Sociotechnical Understanding of Alignment

    Aidan Kierans, Avijit Ghosh, Hananel Hazan, Shiri Dori-Hacohen
    27365-27373
    2025-04-11
  • On the Consideration of AI Openness: Can Good Intent Be Abused?

    Yeeun Kim, Hyunseo Shin, Eunkyung Choi, Hongseok Oh, Hyunjun Kim, Wonseok Hwang
    27374-27382
    2025-04-11
  • Dynamic Back-Substitution in Bound-Propagation-Based Neural Network Verification

    Panagiotis Kouvaros, Benedikt Brückner, Patrick Henriksen, Alessio Lomuscio
    27383-27391
    2025-04-11
  • Maximizing Signal in Human-Model Preference Alignment

    Kelsey Kraus, Margaret Kroll
    27392-27400
    2025-04-11
  • Sequential Decision Making in Stochastic Games with Incomplete Preferences over Temporal Objectives

    Abhishek Ninad Kulkarni, Jie Fu, Ufuk Topcu
    27401-27409
    2025-04-11
16426 - 16450 of 25136 items

Copyright © 2024, Association for the Advancement of Artificial Intelligence
