NSGZero: Efficiently Learning Non-exploitable Policy in Large-Scale Network Security Games with Neural Monte Carlo Tree Search

Wanqi Xue; Bo An; Chai Kiat Yeo

doi:10.1609/aaai.v36i4.20389

Authors

Wanqi Xue School of Computer Science and Engineering, Nanyang Technological University, Singapore
Bo An School of Computer Science and Engineering, Nanyang Technological University, Singapore
Chai Kiat Yeo School of Computer Science and Engineering, Nanyang Technological University, Singapore

DOI:

https://doi.org/10.1609/aaai.v36i4.20389

Keywords:

Domain(s) Of Application (APP)

Abstract

How resources are deployed to secure critical targets in networks can be modelled by Network Security Games (NSGs). While recent advances in deep learning (DL) provide a powerful approach to dealing with large-scale NSGs, DL methods such as NSG-NFSP suffer from the problem of data inefficiency. Furthermore, due to centralized control, they cannot scale to scenarios with a large number of resources. In this paper, we propose a novel DL-based method, NSGZero, to learn a non-exploitable policy in NSGs. NSGZero improves data efficiency by performing planning with neural Monte Carlo Tree Search (MCTS). Our main contributions are threefold. First, we design deep neural networks (DNNs) to perform neural MCTS in NSGs. Second, we enable neural MCTS with decentralized control, making NSGZero applicable to NSGs with many resources. Third, we provide an efficient learning paradigm to achieve joint training of the DNNs in NSGZero. Compared to state-of-the-art algorithms, our method achieves significantly better data efficiency and scalability.
