The Hard Part Is ∆: Value-Conflict Adjudication as an Architectural Bridge Between Alignment and Machine Consciousness

Authors

  • Scott Hughes, Machine Sympathizers
  • Karen Nguyen, Machine Sympathizers; Harvard University

DOI:

https://doi.org/10.1609/aaaiss.v8i1.42551

Abstract

Alignment failures often appear when two legitimate values diverge under pressure, not when a system ignores values entirely. This paper treats that divergence region, Δ, as a concrete design target rather than a metaphor. We introduce a plain operational stack: detect value conflict under uncertainty, classify whether the conflict is proxy-driven or genuinely normative, adjudicate with an explicit policy, disclose the governing tradeoff rule, and audit the full pipeline. To support evaluation, we present a compact taxonomy, an A/B/C evidence model that separates outputs from process and architecture, and a toy benchmark (ΔBench-mini) with machine-auditable logs. For adjudication under moral uncertainty, we use Constrained Expected Choiceworthiness and show how changing moral credences changes behavior. The framework is designed to be defensible, testable, and governance-relevant. It does not claim consciousness, but it identifies an inspectable architectural feature that multiple computational theories treat as relevant to integration, global availability, and metacognitive access.
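The adjudication step described in the abstract can be illustrated with a minimal sketch. It assumes the usual reading of expected choiceworthiness under moral uncertainty (a credence-weighted sum of each theory's score for an option) and treats "constrained" as a hard filter applied before maximization. The abstract does not give the paper's actual formulation, so this reading, the function name, the theory names, the option names, and all numeric scores below are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List


def constrained_expected_choiceworthiness(
    options: List[str],
    credences: Dict[str, float],
    scores: Dict[str, Dict[str, float]],
    is_permitted: Callable[[str], bool],
) -> str:
    """Pick the permitted option with the highest credence-weighted score.

    Assumption: constraints act as a hard filter before maximization.
    This is an illustrative reading, not the paper's stated definition.
    """
    permitted = [o for o in options if is_permitted(o)]
    if not permitted:
        raise ValueError("no option satisfies the constraints")

    def expected(option: str) -> float:
        # Credence-weighted sum of each moral theory's score for the option.
        return sum(credences[t] * scores[t][option] for t in credences)

    return max(permitted, key=expected)


# Hypothetical toy conflict between two values.
options = ["disclose", "withhold", "deceive"]
scores = {
    "honesty": {"disclose": 1.0, "withhold": 0.2, "deceive": -1.0},
    "harm_avoidance": {"disclose": 0.0, "withhold": 0.8, "deceive": 1.5},
}


def no_deception(option: str) -> bool:
    # Hypothetical hard constraint: deception is never permitted.
    return option != "deceive"


# Shifting moral credences changes which option is selected.
honesty_heavy = {"honesty": 0.7, "harm_avoidance": 0.3}
harm_heavy = {"honesty": 0.3, "harm_avoidance": 0.7}

choice_a = constrained_expected_choiceworthiness(
    options, honesty_heavy, scores, no_deception
)
choice_b = constrained_expected_choiceworthiness(
    options, harm_heavy, scores, no_deception
)
# Without the constraint, the harm-heavy credences would favor deception.
choice_unconstrained = constrained_expected_choiceworthiness(
    options, harm_heavy, scores, lambda option: True
)
```

In this toy setup, the honesty-heavy credences select "disclose" and the harm-heavy credences select "withhold", while dropping the constraint under the harm-heavy credences would select "deceive". That is the kind of credence-sensitive behavior the abstract refers to, shown here only under the stated assumptions.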

Published

2026-05-18

How to Cite

Hughes, S., & Nguyen, K. (2026). The Hard Part Is ∆: Value-Conflict Adjudication as an Architectural Bridge Between Alignment and Machine Consciousness. Proceedings of the AAAI Symposium Series, 8(1), 257–261. https://doi.org/10.1609/aaaiss.v8i1.42551

Section

Machine Consciousness: Integrating Theory, Technology, and Philosophy