Learning to Generate Industrial SAT Instances


  • Haoze Wu Stanford University
  • Raghuram Ramanujan Davidson College




In this paper, we present Satgen, the first implicit generative model of real-world Boolean Satisfiability (SAT) formulas. Our approach uses unsupervised machine learning techniques to generate new formulas by mimicking the structural properties of a given input formula Phi. We proceed in two phases: first, we construct the Literal-Incidence Graph (LIG) of Phi. This is used by a Generative Adversarial Network (GAN) to generate new LIGs that exhibit graph-theoretic properties similar to those of the LIG of Phi. In the second phase, we extract a formula whose LIG would correspond to the generated graph. We show that generating such a formula is equivalent to finding a minimal clique edge cover of the given graph, which we tackle efficiently using a greedy hill-climbing algorithm. We verify experimentally that our approach generates formulas that closely resemble a given real-world SAT instance, as measured by a range of different metrics.