Logistical Optimization of the Trans-Caspian International Transport Route: A Multi-Agent Deep Reinforcement Learning Approach to Nash Equilibrium

Bruno G. Kamdem; Nahid Jafari

doi:10.1609/aaaiss.v9i1.42923

Authors

Bruno G. Kamdem, SUNY Farmingdale, School of Business, Department of Business Management, Farmingdale, NY
Nahid Jafari, SUNY Farmingdale, School of Business, Department of Business Management, Farmingdale, NY

Abstract

The Trans-Caspian International Transport Route (TITR) is a strategically vital corridor linking Asia and Europe, yet its performance remains constrained by fragmented tariff regimes, logistical bottlenecks, and pronounced commodity price volatility. These pressures are further exacerbated by geopolitical shocks, including the ongoing crisis in the Middle East that has culminated in the sudden closure of the Strait of Hormuz, thereby amplifying uncertainty across global trade networks. This paper characterizes the operational dynamics of the TITR corridor as a multi-agent stochastic differential game, capturing the strategic interplay between sovereign governments seeking to maximize fiscal revenues and private carriers striving to optimize profit and throughput. To compute the Nash equilibrium in this high-dimensional, non-convex setting, we implement a Multi-Agent Deep Deterministic Policy Gradient (MADDPG) framework. Commodity prices are modeled using a geometric mean-reversion process to reflect realistic market fluctuations. Solving the associated Hamilton-Jacobi-Isaacs (HJI) equations reveals optimal “bang-bang” tax policies governed by endogenous price thresholds. Numerical simulations over 5,000 training episodes show that the centralized critic accurately approximates the agents’ Hamiltonians, delivering stable convergence and robust policy learning. The results demonstrate that agents internalize volatility through shadow pricing mechanisms and that dynamic, threshold-based tax strategies substantially improve corridor throughput while preserving fiscal stability. Overall, the study advances the literature on autonomous logistics and strategic infrastructure management by showing that MADDPG can reliably uncover discontinuous optimal policies in mixed competitive-cooperative environments.
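As a rough illustration of two ingredients named in the abstract, the sketch below simulates a geometric mean-reverting commodity price and applies a threshold-based "bang-bang" tax rule to it. It is not the authors' implementation. The log-Ornstein-Uhlenbeck form of the price process, every parameter value, the function names `simulate_log_ou_prices` and `bang_bang_tax`, the fixed threshold (endogenous in the paper), and the direction of the switch (higher rate above the threshold) are all assumptions made here for illustration.

```python
import math
import random


def simulate_log_ou_prices(p0, kappa, mu, sigma, dt, n_steps, seed=0):
    """Simulate one path of a geometric mean-reverting price.

    Assumed model (not taken from the paper): X = ln P follows
    dX = kappa * (mu - X) dt + sigma dW, so P = exp(X) stays positive
    and its logarithm reverts toward mu at speed kappa.
    """
    rng = random.Random(seed)
    x = math.log(p0)
    decay = math.exp(-kappa * dt)
    # Exact one-step standard deviation of the Ornstein-Uhlenbeck transition.
    step_std = sigma * math.sqrt((1.0 - decay ** 2) / (2.0 * kappa))
    prices = [p0]
    for _ in range(n_steps):
        x = mu + (x - mu) * decay + step_std * rng.gauss(0.0, 1.0)
        prices.append(math.exp(x))
    return prices


def bang_bang_tax(price, threshold, tax_low, tax_high):
    """Switch between two extreme tax rates at a price threshold.

    Hypothetical rule: the threshold is a fixed input here, whereas the
    paper derives it endogenously from the HJI equations.
    """
    return tax_high if price >= threshold else tax_low


# Illustrative parameters only; none of these values come from the paper.
path = simulate_log_ou_prices(
    p0=80.0, kappa=1.5, mu=math.log(75.0), sigma=0.3, dt=1.0 / 252, n_steps=252
)
taxes = [bang_bang_tax(p, threshold=75.0, tax_low=0.05, tax_high=0.20) for p in path]
```

The tax series takes only the two extreme values and jumps whenever the simulated price crosses the threshold. A discontinuity of this kind is what the abstract reports the MADDPG agents recovering.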
